This study investigates properties of tests based on Pearson's
correlation  coefficient  and  Kendall's  tau,  the  two  most  widely  used 
measures  of  correlation.  The  main  problem  of  interest  is  the  partial 
correlation  problem  where  the  variables  Y and  Z are  related  through 
another  variable,  the  covariate  X.  In  this  work  each  of  Y and  Z is 
related  to  X through  the  models 

$$Y = \alpha_1 + \beta_1 X + E$$

and

$$Z = \alpha_2 + \beta_2 X + E' .$$

The hypotheses of interest are

1) $H_0$: E and E' are independent,

and

2) $H_0$: $\tau = 0$,

where $\tau$ is Kendall's correlation coefficient between E and E'.




For  the  first  hypothesis,  Kendall's  tau  calculated  on  the 
residuals  from  estimates  of  the  above  models,  is  proposed.  The 
properties  of  this  statistic  and  its  asymptotic  efficiency  relative 
to  the  Pearson  partial  correlation  coefficient  are  discussed.  Also, 
the  simulated  distribution  of  this  statistic  under  the  null 
hypothesis  of  independence  is  tabulated. 

The null hypothesis $\tau = 0$ is first investigated under the
ordinary  correlation  setting  between  Y and  Z,  i.e.,  in  the  absence  of 
the  covariate  term  X.  Here,  a test  is  proposed  based  on  the  usual 
Kendall's  tau  but  standardized  by  a variance  estimator  which  has 
better  properties  than  the  estimators  discussed  in  the  literature. 

The  simulated  null  distribution  of  this  statistic  is  also  given. 

For the partial correlation formulation using a null hypothesis
$\tau = 0$, a statistic is proposed which is similar to one studied for
the  ordinary  correlation  problem  except  that  it  is  applied  to  the 
residuals  from  the  fitted  model.  The  simulated  null  distributions  of 
this  statistic  generated  from  residuals  obtained  by  the  least  squares 
model  estimates  and  by  least  absolute  regression,  respectively,  are 
also  tabulated. 

Results  of  a Monte  Carlo  study  investigating  the  performances  of 
the  above  statistics  indicate  that 

(i)  for  hypotheses  of  independence,  tests  based  on  Pearson's 
statistics  are  highly  robust  in  both  the  ordinary  correlation, 
and  the  partial  correlation  settings,  and  that 

(ii)  in  both  settings,  the  tests  based  on  our  proposed  modifications 
of  Kendall's  tau  perform  the  best  overall  for  the  hypothesis 
that $\tau = 0$.


CHAPTER  ONE 
INTRODUCTION 


Let  (X,Y,Z)  denote  a random  variable  from  some  absolutely 
continuous  trivariate  distribution  with  distribution  function  F,  and 
consider  testing  the  null  hypothesis  that  Y and  Z are  independent.  If 
this  hypothesis  is  rejected,  one  tends  to  believe  that  the  variables  Y 
and  Z are  dependent.  However,  it  is  possible  that  this  "dependence" 
between  Y and  Z is  due  to  the  effect  of  another  variable  X to  which 
both  Y and  Z are  related  in  some  fashion.  If,  for  example,  Y is  a 
variable  measuring  mathematical  ability  and  Z is  a variable  measuring 
musical  ability,  then  a significant  correlation  between  Y and  Z is 
perhaps  due  to  the  correlation  of  each  of  Y and  Z with  another 
variable  X which  measures  intelligence.  If  one  suspects  that  such  a 
relationship  exists,  then  a more  appropriate  test  may  be  what  is 
commonly  known  as  the  test  for  partial  correlation,  where  the  null 
hypothesis  is  given  by 

$H_0$: Y and Z are independent conditional on X being held constant. (1.1)

That is to say, one "partials out" the effect of the variable X while
testing the independence between Y and Z.

Although, in general, almost any relational structure between Y
and X and between Z and X is possible, we use linear models as the
underlying structure relating these variables. That is, we let

$$Y = \alpha_1 + \beta_1 X + E$$

and

$$Z = \alpha_2 + \beta_2 X + E' , \tag{1.2}$$

where the regression parameters $\alpha_1$, $\alpha_2$, $\beta_1$ and $\beta_2$ are unknown
constants, and the random variable X is independent of both variables
E and E'. Our choice of the linear structure was dictated by the fact
that  the  normal  theory  procedures  discussed  in  our  work  assume  such  a 
structure.  For  example,  the  use  of  Pearson's  partial  correlation 
coefficient  (to  be  discussed  later)  is  inappropriate  unless  both  Y and 
Z have  linear  regressions  on  X (see,  for  example,  Quade,  1974,  p.  376 
and  Korn,  1984,  p.  62).  Under  the  linear  models  given  in  (1.2),  the 
hypothesis  of  (1.1)  is  equivalent  to 

$H_0$: E and E' are independent. (1.3)

The most popular test of partial correlation is that based on
Pearson's partial correlation coefficient, commonly denoted by $R_{YZ.X}$
and given by

$$R_{YZ.X} = \frac{R_{YZ} - R_{YX} R_{ZX}}{\{[1 - R_{YX}^2][1 - R_{ZX}^2]\}^{1/2}} , \tag{1.4}$$

where $R_{YZ}$, $R_{YX}$ and $R_{ZX}$ are the usual product moment correlation
coefficients between Y and Z, Y and X, and Z and X, respectively.
That is, if $(X_1,Y_1,Z_1), \ldots, (X_n,Y_n,Z_n)$ denotes a random sample of
size n from F, then, for example,

$$R_{YZ} = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})(Z_i - \bar{Z})}{\left\{\sum_{i=1}^{n} (Y_i - \bar{Y})^2 \sum_{i=1}^{n} (Z_i - \bar{Z})^2\right\}^{1/2}} .$$

The intuitive appeal of the statistic $R_{YZ.X}$ arises from the fact that
$R_{YZ.X}$ is nothing but the usual product moment correlation coefficient
(Pearson's R) calculated from the residuals of the ordinary least
squares fit of the linear models given in (1.2). However, a
disadvantage of using tests based on $R_{YZ.X}$, which henceforth we shall
denote by $R_n$, is that they all assume that either E|E' or E'|E is
normally distributed. These tests may be nonrobust without this
assumption, an issue to be investigated in this work.
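The identity between formula (1.4) and the correlation of least squares residuals can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names and the simulated data are illustrative choices and are not part of the original study.

```python
import numpy as np

def partial_corr_formula(x, y, z):
    """Pearson's partial correlation of y and z given x, via formula (1.4)."""
    r = np.corrcoef(np.vstack([x, y, z]))
    r_yx, r_zx, r_yz = r[0, 1], r[0, 2], r[1, 2]
    return (r_yz - r_yx * r_zx) / np.sqrt((1 - r_yx**2) * (1 - r_zx**2))

def ols_residuals(x, y):
    """Residuals from the least squares fit of y on an intercept and x."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - intercept - slope * x

def partial_corr_residuals(x, y, z):
    """Ordinary Pearson correlation between the two sets of OLS residuals."""
    return np.corrcoef(ols_residuals(x, y), ols_residuals(x, z))[0, 1]

# Illustrative data only (hypothetical parameter values).
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 1.0 + 2.0 * x + rng.normal(size=50)
z = -0.5 + 0.7 * x + rng.normal(size=50)

r_formula = partial_corr_formula(x, y, z)
r_resid = partial_corr_residuals(x, y, z)
print(r_formula, r_resid)  # the two values agree up to rounding error
```

The agreement is algebraic rather than approximate, so it holds for any sample in which X is not constant.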

Another measure for partial correlation, albeit not as popular,
is the nonparametric Kendall's partial correlation coefficient given
by

$$\tau_{YZ.X} = \frac{\tau_{YZ} - \tau_{YX}\tau_{ZX}}{\{[1 - \tau_{YX}^2][1 - \tau_{ZX}^2]\}^{1/2}} , \tag{1.5}$$

where $\tau_{YZ}$, $\tau_{YX}$ and $\tau_{ZX}$ are the usual Kendall's correlation
coefficients (Kendall's tau) calculated on the variables Y and Z, Y
and X, and Z and X, respectively. That is, for example,

$$\tau_{YZ} = \binom{n}{2}^{-1} \sum_{i<j} \mathrm{Sgn}\{(Y_i - Y_j)(Z_i - Z_j)\} ,$$

where

$$\mathrm{Sgn}\{t\} = \begin{cases} 1 & \text{if } t > 0 \\ 0 & \text{if } t = 0 \\ -1 & \text{if } t < 0 . \end{cases} \tag{1.6}$$


Kendall (1962) obtained the statistic $\tau_{YZ.X}$, also known as Kendall's
partial tau, as follows. For a fixed ranking of the variable X, he
chose two random rankings of the variables Y and Z. For all possible
$\binom{n}{2}$ pairs $(X_i,Y_i,Z_i)$ and $(X_j,Y_j,Z_j)$, $i<j$, he constructed a 2x2
contingency table in which one category contained the frequencies of
agreement (disagreement) of the Y pairs with the X pairs, and the
other category contained those of the Z pairs with the X pairs. From
this table, Kendall calculated the measure of association commonly
known as Kendall's tau-b. Writing the appropriate frequencies in
terms of $\tau_{YZ}$, $\tau_{YX}$ and $\tau_{ZX}$, he then obtained the partial tau statistic
given in (1.5). We have briefly described Kendall's method of
obtaining the statistic $\tau_{YZ.X}$ to stress an important fact:
$\tau_{YZ.X}$ is not the usual Kendall's tau calculated on the residuals
obtained from the linear models (1.2), and the fact that $\tau_{YZ.X}$ has
the same mathematical structure as $R_{YZ.X}$ is merely a coincidence.

The lack of popularity of Kendall's partial tau stems from the
fact that it has many limitations, which are primarily due to its
theoretically complex structure. It is not distribution-free, for
example, and in fact, it is not even asymptotically distribution-free
(its asymptotic variance depends on the underlying distribution of the
variable (X,Y,Z)). Magsoodloo (1975) and Magsoodloo and Pallos (1981)
have tabulated quantile estimates of a null distribution for $\tau_{YZ.X}$
based on Monte Carlo simulations for a variety of sample sizes. We
believe that these quantile estimates are inappropriate for testing
conditional independence since they were generated under the
hypothesis of "total independence," that is, under the assumption that
the three variables X, Y and Z are mutually independent. In some
preliminary Monte Carlo studies, we used these quantile estimates
under the underlying model structure (1.2). As we had expected, the
empirical sizes of such tests were highly inflated under the less
restrictive hypothesis of conditional independence. For example, for
each of 10,000 samples of size n=20 we calculated $\tau_{YZ.X}$ on
the variables X, Y=X+E and Z=X+E', where the mutually independent
standard normal variables X, E and E' were generated by IMSL
subroutines. For a nominal $\alpha$=0.05, each of the 10,000 statistics was
compared to the 95th percentile estimates given by Magsoodloo and
Pallos (1981). The relative frequency of rejection was found to be
0.138, which indicates that Magsoodloo and Pallos's procedures do not
hold their significance levels well under a conditional independence
model.

To test the hypothesis of independence of E and E' of (1.2), we
propose using Kendall's tau calculated on the residuals. If $\hat\alpha_1$, $\hat\alpha_2$,
$\hat\beta_1$ and $\hat\beta_2$ denote estimates of the regression constants $\alpha_1$, $\alpha_2$,
$\beta_1$ and $\beta_2$, respectively, the residuals are

$$U_i = Y_i - \hat\alpha_1 - \hat\beta_1 X_i$$

and

$$V_i = Z_i - \hat\alpha_2 - \hat\beta_2 X_i , \quad i = 1, 2, \ldots, n , \tag{1.7}$$

and the test statistic is given by

$$T_n = \binom{n}{2}^{-1} \sum_{i<j} \mathrm{Sgn}\{(U_i - U_j)(V_i - V_j)\} , \tag{1.8}$$

with Sgn(t) as defined in (1.6).
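The statistic in (1.8) can be sketched in a few lines of code. The example below assumes NumPy and uses ordinary least squares for the regression estimates; the function names `kendall_tau` and `residual_tau` and the simulated data are hypothetical, and other estimators (such as least absolute value) could be substituted for the fit.

```python
import numpy as np

def kendall_tau(u, v):
    """Kendall's tau: average of Sgn{(u_i - u_j)(v_i - v_j)} over all pairs i < j."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    i, j = np.triu_indices(len(u), k=1)      # all index pairs with i < j
    signs = np.sign((u[i] - u[j]) * (v[i] - v[j]))
    return signs.sum() / len(i)              # len(i) equals n(n-1)/2

def ols_residuals(x, y):
    """Residuals from the least squares fit of y on an intercept and x, as in (1.7)."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - intercept - slope * x

def residual_tau(x, y, z):
    """T_n of (1.8): Kendall's tau computed on the two sets of residuals."""
    x, y, z = (np.asarray(a, dtype=float) for a in (x, y, z))
    return kendall_tau(ols_residuals(x, y), ols_residuals(x, z))

# Illustrative data only: Y and Z are related solely through X.
rng = np.random.default_rng(0)
x = rng.normal(size=20)
y = x + rng.normal(size=20)
z = x + rng.normal(size=20)
t_n_value = residual_tau(x, y, z)
print(t_n_value)
```

The pairwise computation is quadratic in n, which is adequate for the small and moderate sample sizes considered in this work.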

The idea of using Kendall's tau calculated from residuals was
considered by Shirahata (1977). In his brief paper, Shirahata tried
to show that the difference between a standardized $T_n$ and a
standardized $S_n$ converges to zero in probability, where $S_n$ is the
usual Kendall's statistic calculated on the variables E and E'. His
method of argument is to show via Monte Carlo simulation that, for
large n, the correlation between $T_n$ and $S_n$ becomes large while the
sample  mean  of  12(T^-Sj^)^/{2n(n-l) (2n+5) } ^^^2  becomes  small.  From 
these considerations he concludes that the approximation of $T_n$ by $S_n$
is satisfactory for large n. Randles (1984) also considers applying
Kendall's tau to residuals; however, his discussion assumes the $X_i$'s
of (1.2) to be known constants rather than random variables as they
are considered to be here.

In our study, we compare the performances of tests based on $T_n$ to
those based on the Pearson's partial correlation coefficient $R_n$. The
statistic $\tau_{YZ.X}$ is not included in this study because of the many
previously discussed disadvantages associated with it. There are many
advantages to using $T_n$. For example, $T_n$ is asymptotically
distribution-free under the model (1.2) and the hypothesis of
conditional independence. Further, $T_n$ has many desirable properties
regardless of the type of regression parameter estimators used. Also,
calculations of asymptotic relative efficiencies (AREs) indicate that,
for heavy-tailed distributions and for large n, tests based on $T_n$ have
higher relative efficiencies than those based on $R_n$. These properties
will be discussed in detail in chapters 2 and 3. In chapter 2, we
discuss the distributional properties of our statistic $T_n$ under the
hypothesis of conditional independence, and tabulate the simulated
null distribution of $T_n$ when X, E and E' have normal distributions.

In chapter 3, we derive an expression for the asymptotic efficiency of
$T_n$ relative to $R_n$ [ARE($T_n$,$R_n$)], where the class of alternatives of
dependence between E and E' is given by the "trivariate reduction"
model

$$E = W_1 + \Delta_n W_3$$

and

$$E' = W_2 + \Delta_n W_3 , \tag{1.9}$$

where $W_1$, $W_2$ and $W_3$ are absolutely continuous and mutually independent
random variables and $\Delta_n$ is a constant.

In chapter 4, we temporarily turn our attention from the partial
correlation problem to a different, yet closely related problem: that
of ordinary correlation. Here, the problem of interest is to study
the association between the variables Y and Z based on a random sample
of pairs $(Y_1,Z_1), \ldots, (Y_n,Z_n)$ from some bivariate continuous
distribution F. This problem is commonly known as the test for
independence since the available testing procedures based on
statistics such as Hoeffding's D, Pearson's R, Spearman's rho and
Kendall's tau all test the null hypothesis of independence,

$H_0$: Y and Z are independent.

Although  the  hypothesis  of  independence  implies  many  desirable  and 
convenient  theoretical  properties,  it  is  our  view  that,  despite  its 
intuitive  appeal,  such  a hypothesis  is  not  broad  enough  to  encompass 
all  situations  when  no  association  exists  between  the  variables  Y and 
Z.  Suppose,  for  example,  that  the  pair  (Y,Z)  has  a spherically 
symmetric  distribution  with  contours  of  the  form  given  in  figure  1.1 
(see,  for  example,  Johnson  and  Ramberg,  1977). 


Figure 1.1 Contours of a spherically symmetric distribution

Although Y and Z may be statistically dependent in such cases, they
are clearly uncorrelated by all usual definitions of correlation.
Moreover, larger values of Y are not associated with larger (or
smaller) values of Z, etc. It is situations such as these, when there
is no correlation between Y and Z, that we would like to include in the null
hypothesis. Indeed, some prominent textbooks state their null
hypothesis as $\tau=0$, but they calculate the null distribution under
independence, not just $\tau=0$, where

$$\tau = P\{(Y_1-Y_2)(Z_1-Z_2)>0\} - P\{(Y_1-Y_2)(Z_1-Z_2)<0\}$$

$$= \text{probability of concordance} - \text{probability of discordance}. \tag{1.10}$$
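A small simulation can illustrate a distribution for which Y and Z are dependent yet $\tau = 0$. The sketch below assumes NumPy and takes points uniformly distributed on the unit disc as one hypothetical instance of a spherically symmetric distribution; it is not drawn from the original study. The sample version of (1.10) is close to zero, while the squares of the coordinates are clearly correlated, which shows the dependence.

```python
import numpy as np

def kendall_tau(u, v):
    """Sample analogue of (1.10): concordant minus discordant proportion over pairs."""
    i, j = np.triu_indices(len(u), k=1)
    return np.sign((u[i] - u[j]) * (v[i] - v[j])).sum() / len(i)

rng = np.random.default_rng(0)
n = 1000

# Uniform points on the unit disc: radius sqrt(U), angle uniform on [0, 2*pi).
radius = np.sqrt(rng.uniform(size=n))
angle = rng.uniform(0.0, 2.0 * np.pi, size=n)
y = radius * np.cos(angle)
z = radius * np.sin(angle)

tau_hat = kendall_tau(y, z)                     # near zero: no monotone association
corr_squares = np.corrcoef(y**2, z**2)[0, 1]    # clearly negative: Y and Z are dependent
print(tau_hat, corr_squares)
```

For this particular distribution the population correlation between the squared coordinates is -1/3, so independence fails even though the probabilities of concordance and discordance are equal.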


It  is  our  contention  that  the  experimenter  often  only  wishes  to 
detect  useful  relationships  between  Y and  Z where  Y,  for  example,  is 
useful  as  a predictor  of  Z or  where,  for  example,  larger  Y-values  are 
associated with larger (or smaller) Z-values, etc. Correlation
coefficients such as $\tau$ attempt to measure these useful relationships.

Of  the  tests  mentioned  earlier,  Hoeffding's  D (see,  for  example, 
Hollander  and  Wolfe,  1973),  which  is  consistent  against  all  types  of 
dependence,  is  not  used  as  often  as  Pearson's  R or  Kendall's  tau. 

This  is  partly  because  it  is  more  difficult  to  compute  and  interpret, 
and  partly  due  to  its  ability  to  detect  all  departures  from 
independence,  which  makes  it  less  powerful  at  detecting  correlated 
departures. In addition to the fact that the respective consistency
classes of the tests based on Pearson's R and Kendall's tau are given
by $\rho \neq 0$ and $\tau \neq 0$, our interest in detecting such alternatives derives
from the fact that it is these alternatives that allow us to conclude
useful relationships between Y and Z. In view of the above, we would
prefer to test the null hypothesis of

$H_0$: No correlation versus Correlation, (1.11)

viewing  this  as  a test  of  a non-useful  versus  a useful  relationship 
between  the  two  variables. 

To us, the most natural and intuitive type of correlation is the
coefficient $\tau$ given in (1.10). The corresponding hypothesis of
interest is

$H_0$: $\tau = 0$ versus $H_a$: $\tau \neq 0$, (1.12)

or the one-sided alternative hypotheses of positive correlation ($\tau>0$) or
negative correlation ($\tau<0$). Note that under $H_0$, the probability of
concordance equals the probability of discordance, so that there is no
correlation between Y and Z, in the sense that one variable does not
increase or decrease with the other variable. Of course, when Y and Z
are independent, $\tau=0$, so that if one rejects the null hypothesis of
(1.12),  one  can  safely  conclude  that  the  variables  Y and  Z are  indeed 
dependent,  and  the  dependence  is  a useful  one  at  least  in  the  sense  of 
predicting direction.

As we mentioned earlier, many authors of statistics textbooks
such as Agresti and Agresti (1979) and Ott, Larson and Mendenhall
(1983) in testing the hypotheses of (1.12) base their rejection of $H_0$
on the quantity

$$Z = \frac{\hat\tau}{\left[\dfrac{2(2n+5)}{9n(n-1)}\right]^{1/2}} , \tag{1.13}$$

where $\hat\tau$ is Kendall's estimate of $\tau$ given by

$$\hat\tau = \binom{n}{2}^{-1} \sum_{i<j} \mathrm{Sgn}\{(Y_i - Y_j)(Z_i - Z_j)\} .$$

We believe that such a test is inappropriate even for large n since
the denominator of (1.13) is the standard deviation of $\hat\tau$ under the
more restrictive hypothesis of independence. Our suspicions of the
inappropriateness of such procedures were supported by our Monte Carlo
studies where we found that, in some cases when $\tau=0$ but Y and Z are
possibly dependent, the empirical $\alpha$-levels were highly inflated,
indicating that this procedure was not maintaining its $\alpha$-level over
the broad class of distributions for which $\tau=0$.
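For concreteness, the quantity in (1.13) can be computed as follows. This is a minimal sketch assuming NumPy; the function names are hypothetical. Note that the standardizing variance $2(2n+5)/\{9n(n-1)\}$ is the variance of $\hat\tau$ under independence only, which is precisely the limitation discussed above.

```python
import numpy as np

def kendall_tau(y, z):
    """Kendall's estimate of tau from paired observations."""
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    i, j = np.triu_indices(len(y), k=1)
    return np.sign((y[i] - y[j]) * (z[i] - z[j])).sum() / len(i)

def kendall_z_independence(y, z):
    """Quantity (1.13): tau-hat divided by its standard deviation under independence."""
    n = len(y)
    var_under_independence = 2.0 * (2 * n + 5) / (9.0 * n * (n - 1))
    return kendall_tau(y, z) / np.sqrt(var_under_independence)

# Illustrative check with a perfectly concordant sample of size 10:
# tau-hat is 1, so the statistic equals sqrt(9*10*9 / (2*25)) = sqrt(16.2).
z_stat = kendall_z_independence(np.arange(10), np.arange(10))
print(z_stat)
```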

In chapter 4, we review and evaluate the different procedures
available for testing (1.12). In particular, we discuss the procedure
recommended by Fligner and Rust (1983) and highlight its limitations.
Then, we propose a statistic similar to the one given in
Fligner and Rust but which has more desirable properties. The
performances of all of these procedures are then investigated by a
Monte Carlo study. The results of this study and a summary of
conclusions and recommendations are given at the end of chapter 4.

In chapter 5, we return to the partial correlation problem in an
effort to investigate the performances of the tests based on the
statistics $T_n$, $R_n$ and some of the statistics studied in chapter 4 but
this time applied to the residuals. Through a Monte Carlo study, the
empirical powers and sizes of seven different statistics are compared,
both under the hypothesis of independence and the hypothesis that
$\tau=0$. In each case, the residuals are obtained by two different
methods of regression parameter estimation: (i) the ordinary least
squares method, and (ii) the method of least absolute regression. The
tables of results appear throughout chapter 5 followed by our
conclusions and recommendations. A list of related topics for future
study appears at the end of chapter 5.


CHAPTER  TWO 

PROPERTIES OF THE STATISTIC $T_n$

2.1 Introduction

Let $(X_1, Y_1, Z_1), (X_2, Y_2, Z_2), \ldots, (X_n, Y_n, Z_n)$ denote a
random sample of observable triples from some absolutely continuous
distribution, with distribution function $F(\cdot)$, and let (X, Y, Z) be
distributed as $(X_1, Y_1, Z_1)$. To test the conditional independence of
Y and Z, holding X constant, we shall assume that each of Y and Z is
linearly related to X as follows:

$$Y_i = \alpha_1 + \beta_1 X_i + E_i$$

and

$$Z_i = \alpha_2 + \beta_2 X_i + E_i' , \tag{2.1.1}$$

where $\alpha_1$, $\alpha_2$, $\beta_1$ and $\beta_2$ are unknown parameters which need to be
estimated. Here, $X_1, X_2, \ldots, X_n$, which will be referred to as the
"covariate terms," are independent identically distributed (i.i.d.)
random variables with an absolutely continuous distribution function
$F_X(\cdot)$, mean $\mu_X$ and variance $\sigma_X^2$. The "error terms" $(E_i, E_i')$,
i = 1, 2, ..., n, are i.i.d. absolutely continuous bivariate
random variables. The respective marginal distribution of $E_i$
($E_i'$) is assumed to have mean zero, distribution function
$H_1(\cdot)$ ($H_2(\cdot)$) and variance $\sigma_E^2$ ($\sigma_{E'}^2$). Further, it will be assumed that
$X_i$ is independent of $(E_i, E_i')$, i = 1, 2, ..., n.
The hypothesis of interest is

$H_0$: $E_i$ and $E_i'$ are independent, i = 1, 2, ..., n.

Our proposed test statistic, $T_n$, is the Kendall's tau statistic
applied to the residuals (i.e., to the estimates of the unobservable
error terms, $E_1, E_2, \ldots, E_n$ and $E_1', E_2', \ldots, E_n'$). If $\hat\alpha_1$, $\hat\alpha_2$,
$\hat\beta_1$ and $\hat\beta_2$ denote the estimates of $\alpha_1$, $\alpha_2$, $\beta_1$ and $\beta_2$, respectively,
the residuals are given by

$$U_i = Y_i - \hat\alpha_1 - \hat\beta_1 X_i$$

and

$$V_i = Z_i - \hat\alpha_2 - \hat\beta_2 X_i , \quad i = 1, 2, \ldots, n , \tag{2.1.2}$$

and the proposed test statistic is

$$T_n = \binom{n}{2}^{-1} \sum_{i<j} \mathrm{Sgn}[(U_i - U_j)(V_i - V_j)] ,$$

where

$$\mathrm{Sgn}(t) = \begin{cases} 1 & \text{if } t > 0 \\ 0 & \text{if } t = 0 \\ -1 & \text{if } t < 0 . \end{cases} \tag{2.1.3}$$


In the sections to follow, we shall discuss the properties of this
statistic. In section 2.2 it will be shown that the distribution of
$T_n$ is free of the regression constants $\alpha_1$, $\alpha_2$, $\beta_1$ and $\beta_2$, the
location parameter $\mu_X$ and the scale parameters $\sigma_X$, $\sigma_E$ and $\sigma_{E'}$,
provided that the estimates of the regression constants $\hat\alpha_1$, $\hat\alpha_2$, $\hat\beta_1$
and $\hat\beta_2$ satisfy certain general properties. In section 2.3 the small
sample moments and the symmetric distribution of $T_n$ under the null
hypothesis of independence will be discussed. In section 2.4 the
asymptotic distribution of $T_n$ under $H_0$ will be developed. Section 2.5
will contain the tables of the small sample null distribution of the
$T_n$ statistic as generated by a Monte Carlo simulation study when the
$X_i$'s, $E_i$'s, and $E_i'$'s are normally distributed.

2.2 The Effects of Parameters on the Distribution of $T_n$

Unlike the usual Kendall's tau statistic, $T_n$ is not a
distribution-free statistic even under the hypothesis of the
independence of the "error terms." Its distribution depends on the
distribution of the $X_i$'s, $E_i$'s and $E_i'$'s, i = 1, 2, ..., n. To see
this, we write the residuals given in (2.1.2) in terms of the error
terms to obtain

$$U_i = (\alpha_1 - \hat\alpha_1) - (\hat\beta_1 - \beta_1) X_i + E_i ,$$

and similarly,

$$V_i = (\alpha_2 - \hat\alpha_2) - (\hat\beta_2 - \beta_2) X_i + E_i' , \quad i = 1, 2, \ldots, n. \tag{2.2.1}$$

The statistic $T_n$ is the Kendall's correlation coefficient (Kendall's
tau) calculated on the pairs

$$(U_i, V_i) = [(\alpha_1 - \hat\alpha_1) - (\hat\beta_1 - \beta_1) X_i + E_i ,\ (\alpha_2 - \hat\alpha_2) - (\hat\beta_2 - \beta_2) X_i + E_i'] , \quad i = 1, 2, \ldots, n. \tag{2.2.2}$$


The distribution-free property of the usual Kendall's statistic under
$H_0$ results from the fact that under the hypothesis of independence the
two elements of the pair are exchangeable, and there is independence
between pairs. However, in the set-up considered here, the two
elements of the pair are not exchangeable due to the presence of the
$X_i$'s in both elements. (Note that in (2.2.1) the $X_i$'s appear both
explicitly and implicitly through the estimators $\hat\alpha_1$, $\hat\alpha_2$, $\hat\beta_1$ and $\hat\beta_2$.)
Although the statistic $T_n$ is not distribution-free, its distribution
does not depend on the parameters $\alpha_1$, $\alpha_2$, $\beta_1$, $\beta_2$, $\mu_X$, $\sigma_X$, $\sigma_E$ and $\sigma_{E'}$
under "translation" and "scale" properties to be discussed later.

The statistic $T_n$ is free of the terms $\alpha_1$, $\hat\alpha_1$, $\alpha_2$ and $\hat\alpha_2$ since
these quantities are cancelled out by taking the differences of the
residuals. Writing

$$\mathrm{Sgn}\{(U_i - U_j)(V_i - V_j)\} = \mathrm{Sgn}\{[(E_i - E_j) - (\hat\beta_1 - \beta_1)(X_i - X_j)][(E_i' - E_j') - (\hat\beta_2 - \beta_2)(X_i - X_j)]\} ,$$

we see that

$$T_n = \binom{n}{2}^{-1} \sum_{i<j} \mathrm{Sgn}\{[(E_i - E_j) - (\hat\beta_1 - \beta_1)(X_i - X_j)][(E_i' - E_j') - (\hat\beta_2 - \beta_2)(X_i - X_j)]\} . \tag{2.2.3}$$

Thus, without loss of generality, the intercept terms $\alpha_1$ and $\alpha_2$ may be
taken to be zero. Furthermore, the distribution of $T_n$ is free of the
location parameter $\mu_X$. For, if $\mu_X \neq 0$, consider the transformed
zero-mean random variables $X_i^* = X_i - \mu_X$, i = 1, 2, ..., n. The
underlying model may now be written as

$$Y_i = \alpha_1 + \beta_1 (X_i^* + \mu_X) + E_i = \alpha_1' + \beta_1 X_i^* + E_i ,$$

and

$$Z_i = \alpha_2' + \beta_2 X_i^* + E_i' , \quad i = 1, 2, \ldots, n ,$$

where

$$\alpha_1' = \alpha_1 + \beta_1 \mu_X \quad \text{and} \quad \alpha_2' = \alpha_2 + \beta_2 \mu_X .$$

By the preceding argument, $T_n$ is free of $\alpha_1'$ and $\alpha_2'$, and is therefore
free of the location parameter $\mu_X$.

To ensure that the distribution of $T_n$ is free of the remaining
parameters $\beta_1$, $\beta_2$, $\sigma_X^2$, $\sigma_E^2$ and $\sigma_{E'}^2$, it is sufficient that the slope
estimators $\hat\beta_1$ and $\hat\beta_2$ satisfy the following properties.

"Translation" property 2.2.4

Assume each $\hat\beta_i$, i = 1, 2, satisfies

$$\hat\beta_i(x_1, \ldots, x_n;\ y_1 + c x_1, \ldots, y_n + c x_n) = \hat\beta_i(x_1, \ldots, x_n;\ y_1, \ldots, y_n) + c$$

for every $x_1, \ldots, x_n$, $y_1, \ldots, y_n$ and c.

"Scale" property 2.2.5

Assume each $\hat\beta_i$, i = 1, 2, satisfies

$$\hat\beta_i(a x_1, \ldots, a x_n;\ b y_1, \ldots, b y_n) = \frac{b}{a}\, \hat\beta_i(x_1, \ldots, x_n;\ y_1, \ldots, y_n)$$

for every $x_1, \ldots, x_n$, $y_1, \ldots, y_n$, b and $a \neq 0$.
From expression (2.2.3), we can see that the statistic $T_n$
involves quantities of the form

$$E_i - (\hat\beta_1 - \beta_1) X_i$$

and

$$E_i' - (\hat\beta_2 - \beta_2) X_i . \tag{2.2.6}$$

Applying property 2.2.4 with $c = -\beta_1$ and $c = -\beta_2$, respectively, we
obtain

$$\hat\beta_1(X_1, \ldots, X_n;\ Y_1, \ldots, Y_n) - \beta_1 = \hat\beta_1(X_1, \ldots, X_n;\ Y_1 - \beta_1 X_1, \ldots, Y_n - \beta_1 X_n)$$

and

$$\hat\beta_2(X_1, \ldots, X_n;\ Z_1, \ldots, Z_n) - \beta_2 = \hat\beta_2(X_1, \ldots, X_n;\ Z_1 - \beta_2 X_1, \ldots, Z_n - \beta_2 X_n) ,$$

so that the quantities $(\hat\beta_1 - \beta_1)$ and $(\hat\beta_2 - \beta_2)$ may be replaced by

$$\hat\beta_1^* = \hat\beta_1(X_1, \ldots, X_n;\ Y_1 - \beta_1 X_1, \ldots, Y_n - \beta_1 X_n)$$

and

$$\hat\beta_2^* = \hat\beta_2(X_1, \ldots, X_n;\ Z_1 - \beta_2 X_1, \ldots, Z_n - \beta_2 X_n) ,$$

without changing the value of $T_n$. These new estimators $\hat\beta_1^*$ and $\hat\beta_2^*$ are
the slope estimators obtained by replacing $Y_i$ by $Y_i - \beta_1 X_i$ and $Z_i$ by
$Z_i - \beta_2 X_i$, respectively, in the model structure

$$Y_i = \alpha_1 + \beta_1 X_i + E_i$$

and

$$Z_i = \alpha_2 + \beta_2 X_i + E_i' , \quad i = 1, 2, \ldots, n .$$

This is equivalent to using the slope estimators obtained from the
models

$$Y_i = \alpha_1 + E_i \quad \text{and} \quad Z_i = \alpha_2 + E_i' , \quad i = 1, 2, \ldots, n ,$$

which is the usual model with $\beta_1 = \beta_2 = 0$. Consequently, the
statistic $T_n$ does not depend on the values of the slope parameters $\beta_1$
and $\beta_2$.

Next, "scale" property 2.2.5 is used to show the distribution of
$T_n$ is free of the scale parameters $\sigma_X$, $\sigma_E$ and $\sigma_{E'}$. The statistic $T_n$
involves residuals of the form

$$U_i = Y_i - \hat\beta_1 X_i$$

and

$$V_i = Z_i - \hat\beta_2 X_i , \quad i = 1, 2, \ldots, n .$$

From property 2.2.5 with $a = 1/\sigma_X$ and $b = 1$,

$$\sigma_X\, \hat\beta_i(x_1, \ldots, x_n;\ y_1, \ldots, y_n) = \hat\beta_i(x_1/\sigma_X, \ldots, x_n/\sigma_X;\ y_1, \ldots, y_n) ,$$

i = 1, 2, so that the residual estimates above may be written as

$$U_i = Y_i - \hat\beta_1(X_1/\sigma_X, \ldots, X_n/\sigma_X;\ Y_1, \ldots, Y_n) \left(\frac{X_i}{\sigma_X}\right)$$

and

$$V_i = Z_i - \hat\beta_2(X_1/\sigma_X, \ldots, X_n/\sigma_X;\ Z_1, \ldots, Z_n) \left(\frac{X_i}{\sigma_X}\right) ,$$

which indicates that the $X_i$'s may be replaced by their standardized
forms, $X_i/\sigma_X$, without changing the values of the residuals. Thus, $T_n$
is free of the scale parameter of the X's.

From (2.2.6) and the discussion immediately following, we can see
that $T_n$ may be written in terms of residual estimates of the form

$$R_i = E_i - \hat\beta_1(X_1, \ldots, X_n;\ E_1, \ldots, E_n)\, X_i$$

and

$$R_i' = E_i' - \hat\beta_2(X_1, \ldots, X_n;\ E_1', \ldots, E_n')\, X_i . \tag{2.2.7}$$

Applying "scale" property 2.2.5 with $a = 1$ and $b = 1/\sigma_E$, we have

$$\hat\beta_1(X_1, \ldots, X_n;\ E_1/\sigma_E, \ldots, E_n/\sigma_E)\, \sigma_E = \hat\beta_1(X_1, \ldots, X_n;\ E_1, \ldots, E_n) .$$

Similarly,

$$\hat\beta_2(X_1, \ldots, X_n;\ E_1'/\sigma_{E'}, \ldots, E_n'/\sigma_{E'})\, \sigma_{E'} = \hat\beta_2(X_1, \ldots, X_n;\ E_1', \ldots, E_n') ,$$

so that replacing the error terms $E_i$ ($E_i'$) in (2.2.7) by their
standardized forms $E_i/\sigma_E$ ($E_i'/\sigma_{E'}$), i = 1, 2, ..., n, will result in
transforming the residual estimates to $R_i/\sigma_E$ ($R_i'/\sigma_{E'}$). However, the
statistic $T_n$, which is based on the sign of the product of the residual
differences, is not affected by such scaling, since both $\sigma_E$ and $\sigma_{E'}$ are
positive constants, and hence the distribution of $T_n$ is free of these
scale parameters.
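The invariance argument of this section can be illustrated numerically. The sketch below assumes NumPy and least squares estimates; the parameter values and function names are hypothetical. Holding one realization of the standardized covariate and error terms fixed, it changes the regression constants, the location and scale of X, and the error scales, and confirms that the value of the residual-based Kendall's tau does not change.

```python
import numpy as np

def kendall_tau(u, v):
    """Kendall's tau computed over all pairs i < j."""
    i, j = np.triu_indices(len(u), k=1)
    return np.sign((u[i] - u[j]) * (v[i] - v[j])).sum() / len(i)

def ols_residuals(x, y):
    """Residuals from the least squares fit of y on an intercept and x."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - intercept - slope * x

def t_n(x, y, z):
    """Kendall's tau applied to the two sets of OLS residuals."""
    return kendall_tau(ols_residuals(x, y), ols_residuals(x, z))

rng = np.random.default_rng(1)
n = 30
x0 = rng.normal(size=n)               # standardized covariate values
e = rng.standard_t(df=3, size=n)      # error terms for Y (heavy-tailed, illustrative)
e_prime = rng.normal(size=n)          # error terms for Z

def t_n_under(alpha1, beta1, alpha2, beta2, mu_x, sigma_x, sigma_e, sigma_e_prime):
    """T_n for the same underlying draws but different nuisance parameters."""
    x = mu_x + sigma_x * x0
    y = alpha1 + beta1 * x + sigma_e * e
    z = alpha2 + beta2 * x + sigma_e_prime * e_prime
    return t_n(x, y, z)

baseline = t_n_under(0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0)
shifted = t_n_under(5.0, -2.0, 1.0, 0.3, 10.0, 3.0, 4.0, 0.5)
print(baseline, shifted)  # identical values
```

The same check could be repeated with any slope estimator satisfying properties 2.2.4 and 2.2.5 in place of least squares.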

Thus far in this section we have shown that if $\hat\beta_1$ and $\hat\beta_2$ satisfy
properties 2.2.4 and 2.2.5, the distribution of the statistic $T_n$ is
independent of the regression constants used in the linear models, the
location and scale of the "covariate term" X, and of the scale
parameters of the error terms. In the remainder of this section, we
shall demonstrate that properties 2.2.4 and 2.2.5 are very natural,
and that the three types of slope estimators we have used in this
study, namely the least squares estimator (OLS), the least absolute
value estimator (LAV), and Theil's slope estimator, all satisfy these
properties under the linear model

$$Y_i = \alpha + \beta X_i + E_i , \quad i = 1, 2, \ldots, n .$$


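The least squares (OLS) estimator of the slope is given by

$$\hat\beta(x_1, \ldots, x_n;\ y_1, \ldots, y_n) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} ,$$

so that

(i)
$$\hat\beta(x_1, \ldots, x_n;\ y_1 + c x_1, \ldots, y_n + c x_n) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i + c x_i - \bar{y} - c\bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

$$= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} + \frac{c \sum_{i=1}^{n} (x_i - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

$$= \hat\beta(x_1, \ldots, x_n;\ y_1, \ldots, y_n) + c ,$$

(ii)
$$\hat\beta(a x_1, \ldots, a x_n;\ b y_1, \ldots, b y_n) = \frac{\sum_{i=1}^{n} (a x_i - a\bar{x})(b y_i - b\bar{y})}{\sum_{i=1}^{n} (a x_i - a\bar{x})^2}$$

$$= \frac{a b \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{a^2 \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

$$= \frac{b}{a}\, \hat\beta(x_1, \ldots, x_n;\ y_1, \ldots, y_n) ,$$

provided $a \neq 0$, and 2.2.4 and 2.2.5 are satisfied.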

The least absolute value (LAV) estimators of $\alpha$ and $\beta$, denoted by
$\hat\alpha$ and $\hat\beta$ respectively, are the values of $\alpha$ and $\beta$ which satisfy

$$\min_{\alpha,\beta} \sum_{i=1}^{n} |y_i - \alpha - \beta x_i| = \sum_{i=1}^{n} |y_i - \hat\alpha - \hat\beta x_i| . \tag{2.2.8}$$

To see that the LAV slope estimator, $\hat\beta$, satisfies properties 2.2.4 and
2.2.5, note that

$$\min_{\alpha,\beta} \sum_{i=1}^{n} |y_i - \alpha - \beta x_i| = \sum_{i=1}^{n} |y_i - \hat\alpha - \hat\beta x_i|$$

$$= \sum_{i=1}^{n} |y_i - \hat\alpha - (\hat\beta + c - c) x_i|$$

$$= \sum_{i=1}^{n} |y_i + c x_i - \hat\alpha - (\hat\beta + c) x_i| ,$$

so that if $y_i$ is replaced by $y_i + c x_i$, i = 1, 2, ..., n, the new
slope estimate is given by $(\hat\beta + c)$, which proves the "translation"
property 2.2.4. Also, for $a \neq 0$ and $b \neq 0$,

$$\min_{\alpha,\beta} \sum_{i=1}^{n} |y_i - \alpha - \beta x_i| = \sum_{i=1}^{n} \left|y_i - \hat\alpha - \frac{\hat\beta}{a}\, (a x_i)\right|$$

$$= \frac{1}{|b|} \sum_{i=1}^{n} \left|b y_i - b\hat\alpha - \frac{b}{a}\hat\beta\, (a x_i)\right| .$$

This last expression indicates that when $x_i$ is replaced by $a x_i$, and $y_i$
is replaced by $b y_i$, i = 1, 2, ..., n, the new slope estimate is
given by $\frac{b}{a}\hat\beta$. Note also that b=0 is equivalent to all the $y_i$'s equal
to zero, in which case the LAV estimates are $\hat\alpha=0$ and $\hat\beta=0$. This proves
the "scale" property 2.2.5.

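Theil's estimate of the slope (see Sen, 1968) is the median of the $\binom{n}{2}$ slopes obtained from the $(X_i, Y_i)$ pairs, i.e.,

$$\hat\beta(x_1,\ldots,x_n;\, y_1,\ldots,y_n) = \operatorname*{median}_{i<j}\left\{\frac{y_j-y_i}{x_j-x_i}\right\}.$$

We assume that all the $x_i$'s are distinct because they have a continuous distribution. It follows that

(i)
$$
\begin{aligned}
\hat\beta(x_1,\ldots,x_n;\, y_1+cx_1,\ldots,y_n+cx_n)
 &= \operatorname*{median}_{i<j}\left\{\frac{y_j+cx_j-y_i-cx_i}{x_j-x_i}\right\} \\
 &= \operatorname*{median}_{i<j}\left\{\frac{y_j-y_i}{x_j-x_i} + c\right\} \\
 &= \hat\beta(x_1,\ldots,x_n;\, y_1,\ldots,y_n) + c,
\end{aligned}
$$

and

(ii)
$$
\hat\beta(ax_1,\ldots,ax_n;\, by_1,\ldots,by_n)
 = \operatorname*{median}_{i<j}\left\{\frac{by_j-by_i}{ax_j-ax_i}\right\}
 = \frac{b}{a}\,\hat\beta(x_1,\ldots,x_n;\, y_1,\ldots,y_n),
$$

and therefore this estimator also satisfies the properties 2.2.4 and 2.2.5.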
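A short numerical illustration of the median-of-pairwise-slopes estimator and its two properties follows. It is a sketch under assumed data; `theil_slope` is a hypothetical helper name.

```python
import random
from itertools import combinations
from statistics import median

def theil_slope(x, y):
    """Median of the C(n,2) pairwise slopes (y_j - y_i) / (x_j - x_i)."""
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i, j in combinations(range(len(x)), 2)]
    return median(slopes)

rng = random.Random(2)  # arbitrary seed
x = [rng.gauss(0, 1) for _ in range(15)]  # continuous draws, so distinct x's
y = [2.0 + 0.8 * xi + rng.gauss(0, 1) for xi in x]
a, b, c = -4.0, 1.5, 2.0  # illustrative constants, a != 0

t0 = theil_slope(x, y)
t_shift = theil_slope(x, [yi + c * xi for xi, yi in zip(x, y)])
t_scale = theil_slope([a * xi for xi in x], [b * yi for yi in y])
print(t_shift - t0, t_scale / t0)  # approximately c and b/a
```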


2.3  The Null Hypothesis Distribution of $T_n$

The test statistic $T_n$ given by

$$T_n = \binom{n}{2}^{-1}\sum_{i<j}\operatorname{Sgn}\{[(Y_i-Y_j)-\hat\beta_1(X_i-X_j)][(Z_i-Z_j)-\hat\beta_2(X_i-X_j)]\} \tag{2.3.1}$$

$$\phantom{T_n} = \binom{n}{2}^{-1}\sum_{i<j}\operatorname{Sgn}\{[(E_i-E_j)-(\hat\beta_1-\beta_1)(X_i-X_j)][(E'_i-E'_j)-(\hat\beta_2-\beta_2)(X_i-X_j)]\} \tag{2.3.2}$$

would be a U-statistic of degree 2, except for the presence of the terms $\hat\beta_1$ and $\hat\beta_2$. The symmetric kernel of this U-statistic with these two auxiliary estimators in the kernel is

$$h(S_1,S_2;\hat\beta) = \operatorname{Sgn}\{[(E_1-E_2)-(\hat\beta_1-\beta_1)(X_1-X_2)][(E'_1-E'_2)-(\hat\beta_2-\beta_2)(X_1-X_2)]\},$$

where $S_i = (X_i, E_i, E'_i)$, $i = 1, 2$. Because this U-statistic involves the estimated parameters $\hat\beta_1$ and $\hat\beta_2$, ordinary U-statistic theorems (see, for example, Randles and Wolfe, 1979) cannot be used to develop its large sample distributional properties. In what follows, we will use equal-in-distribution arguments to show that under $H_0$, when the distribution of at least one of the error terms, say $E$, is symmetric about zero, $T_n$ is symmetric about its mean zero. In addition, we shall derive an expression for $\operatorname{Var}[T_n]$, and discuss the null asymptotic distribution of $T_n$ using a theorem by Randles (1982).

Since the distribution of $T_n$ is free of the parameters $\alpha_1$, $\alpha_2$, $\beta_1$ and $\beta_2$, with no loss of generality we assume each of them to be zero in the following discussion. The statistic may be written as


$$T_n = \binom{n}{2}^{-1}\sum_{i<j}\operatorname{Sgn}\{[(E_i-E_j)-\hat\beta_1(X_i-X_j)][(E'_i-E'_j)-\hat\beta_2(X_i-X_j)]\}, \tag{2.3.3}$$

where $\hat\beta_1$ ($\hat\beta_2$) is a function of the $X_i$'s and the $E_i$'s ($E'_i$'s). Let

$$Q_{ij} = [(E_i-E_j)-\hat\beta_1(X_i-X_j)]$$

and

$$Q'_{ij} = [(E'_i-E'_j)-\hat\beta_2(X_i-X_j)],$$

and write

$$T_n = \binom{n}{2}^{-1}\sum_{i<j}\operatorname{Sgn}\{Q_{ij}Q'_{ij}\}.$$
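As an illustration of the statistic written in terms of $Q_{ij}$ and $Q'_{ij}$, the sketch below computes $T_n$ with OLS slopes and checks that its value does not depend on the regression parameters, which is the reason they may be set to zero. The helper names (`ols_slope`, `t_n`, `stat`) and the parameter values are assumptions made for the example.

```python
import random
from itertools import combinations
from math import comb

def ols_slope(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx

def sgn(v):
    return (v > 0) - (v < 0)

def t_n(x, y, z, slope1, slope2):
    """T_n = C(n,2)^{-1} * sum_{i<j} Sgn(Q_ij * Q'_ij)."""
    total = 0
    for i, j in combinations(range(len(x)), 2):
        q = (y[i] - y[j]) - slope1 * (x[i] - x[j])
        q_prime = (z[i] - z[j]) - slope2 * (x[i] - x[j])
        total += sgn(q) * sgn(q_prime)
    return total / comb(len(x), 2)

rng = random.Random(3)  # arbitrary seed
n = 10
x = [rng.gauss(0, 1) for _ in range(n)]
e = [rng.gauss(0, 1) for _ in range(n)]
e_prime = [rng.gauss(0, 1) for _ in range(n)]

def stat(alpha1, beta1, alpha2, beta2):
    # Build Y and Z from the two linear models, then fit and compute T_n.
    y = [alpha1 + beta1 * xi + ei for xi, ei in zip(x, e)]
    z = [alpha2 + beta2 * xi + ei for xi, ei in zip(x, e_prime)]
    return t_n(x, y, z, ols_slope(x, y), ols_slope(x, z))

t_zero = stat(0.0, 0.0, 0.0, 0.0)
t_other = stat(3.0, -2.0, -1.0, 5.0)  # hypothetical nonzero parameters
print(t_zero, t_other)
```

The two values agree because the OLS slope satisfies the translation property 2.2.4, so the residual differences are unchanged.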


Now suppose the distribution of the $E_i$'s is symmetric about zero, that is, $E_i \overset{d}{=} -E_i$, $i = 1, 2, \ldots, n$. From the "scale" property 2.2.5, we note that

$$\hat\beta_1(X_1,\ldots,X_n;\,-E_1,\ldots,-E_n) = -\hat\beta_1(X_1,\ldots,X_n;\,E_1,\ldots,E_n).$$

Using the independence of the $E_i$'s and their independence from the $E'_i$'s and $X_i$'s, we have

$$(X_1,E_1,E'_1,\ldots,X_n,E_n,E'_n) \overset{d}{=} (X_1,-E_1,E'_1,\ldots,X_n,-E_n,E'_n).$$

Computing $T_n$ on both sides of the equal-in-distribution sign yields

$$T_n \overset{d}{=} -T_n,$$

and, therefore, under $H_0$ and the assumption of the symmetry of one set of error terms, $T_n$ is symmetric about its mean of zero. Note that when the assumption of symmetry is dropped $E[T_n]$ is, in general, different from zero even under $H_0$, and will be given by

$$
\begin{aligned}
E[T_n] &= P\{Q_{12}Q'_{12}>0\} - P\{Q_{12}Q'_{12}<0\} \\
 &= 2P\{Q_{12}>0,\,Q'_{12}>0\} + 2P\{Q_{12}<0,\,Q'_{12}<0\} - 1.
\end{aligned} \tag{2.3.4}
$$

The expression for the null variance of $T_n$ is rather complex since the distribution of $T_n$ depends on the underlying distributions of the $X_i$'s, $E_i$'s and $E'_i$'s, and the type of slope estimators $\hat\beta_1$ and $\hat\beta_2$ used to generate the residuals. This, however, causes no limitation to the applicability of our results for large sample sizes, as we shall demonstrate later, since the limiting null variance is free of the underlying distributions and the kind of slope estimators used. For the sake of completeness, however, we will include the general form for the null variance of $T_n$. We write

$$T_n = \binom{n}{2}^{-1}\sum_{i<j} h(S_i,S_j;\hat\beta),$$

where

$$
\begin{aligned}
h(S_i,S_j;\hat\beta) &= \operatorname{Sgn}\{[(E_i-E_j)-\hat\beta_1(X_i-X_j)][(E'_i-E'_j)-\hat\beta_2(X_i-X_j)]\} \\
 &= \operatorname{Sgn}\{Q_{ij}Q'_{ij}\}
\end{aligned}
$$

and $\hat\beta = (\hat\beta_1,\hat\beta_2)'$. Let $\theta$ denote the mean of $T_n$ given in (2.3.4). That is,

$$\theta = 2P\{Q_{12}>0,\,Q'_{12}>0\} + 2P\{Q_{12}<0,\,Q'_{12}<0\} - 1.$$
Then,

$$\operatorname{Var}[T_n] = E\Bigl[\Bigl\{\binom{n}{2}^{-1}\sum_{i<j}[h(S_i,S_j;\hat\beta)-\theta]\Bigr\}^2\Bigr].$$

There are three types of terms in the above expression:

Type 0, where the two kernels involve no subscripts in common. There are

$$\binom{n}{2}\binom{2}{0}\binom{n-2}{2} = \binom{n}{2}\frac{(n-2)(n-3)}{2}$$

such terms.

Type 1 are terms with one subscript in common. There are

$$\binom{n}{2}\binom{2}{1}\binom{n-2}{1} = 2\binom{n}{2}(n-2)$$

such terms.

Type 2 terms have two subscripts in common. There are $\binom{n}{2}$ of them.

Denoting the expectations of such terms by $\zeta_0$, $\zeta_1$ and $\zeta_2$, respectively, we have

$$\operatorname{Var}[T_n] = \binom{n}{2}^{-1}\Bigl[\frac{(n-2)(n-3)}{2}\,\zeta_0 + 2(n-2)\,\zeta_1 + \zeta_2\Bigr]. \tag{2.3.5}$$
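The bookkeeping behind (2.3.5) can be verified with exact arithmetic. In the sketch below, `term_counts` and `var_tn` are hypothetical helper names. The second check plugs in $\zeta_0 = 0$, $\zeta_1 = 1/9$, $\zeta_2 = 1$, which are the values for ordinary Kendall's tau under independence with no fitted slopes; that is a standard result assumed here, not one derived in this section.

```python
from fractions import Fraction
from math import comb

def term_counts(n):
    """Numbers of Type 0, Type 1 and Type 2 terms in the squared double sum."""
    pairs = comb(n, 2)
    type0 = pairs * comb(n - 2, 2)   # no subscripts in common
    type1 = pairs * 2 * (n - 2)      # exactly one subscript in common
    type2 = pairs                    # both subscripts in common
    return type0, type1, type2

def var_tn(n, zeta0, zeta1, zeta2):
    """Right-hand side of (2.3.5)."""
    bracket = Fraction((n - 2) * (n - 3), 2) * zeta0 + 2 * (n - 2) * zeta1 + zeta2
    return bracket / comb(n, 2)

n = 10
counts = term_counts(n)
# The three types together account for every ordered pair of kernels.
print(sum(counts) == comb(n, 2) ** 2)

# Ordinary Kendall's tau under independence (assumed zeta values, see above).
kendall = var_tn(n, 0, Fraction(1, 9), 1)
print(kendall == Fraction(2 * (2 * n + 5), 9 * n * (n - 1)))
```

The second comparison recovers the familiar null variance $2(2n+5)/[9n(n-1)]$ of Kendall's tau, which is a useful sanity check on the coefficients in (2.3.5).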
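where

$$
\begin{aligned}
\zeta_0 &= E[h(S_1,S_2;\hat\beta)\,h(S_3,S_4;\hat\beta)] - \theta^2 \\
 &= E[\operatorname{Sgn}\{Q_{12}Q'_{12}\}\operatorname{Sgn}\{Q_{34}Q'_{34}\}] - \theta^2 \\
 &= P\{Q_{12}Q'_{12}>0,\,Q_{34}Q'_{34}>0\} + P\{Q_{12}Q'_{12}<0,\,Q_{34}Q'_{34}<0\} \\
 &\quad - P\{Q_{12}Q'_{12}>0,\,Q_{34}Q'_{34}<0\} - P\{Q_{12}Q'_{12}<0,\,Q_{34}Q'_{34}>0\} - \theta^2 \\
 &= 2P\{Q_{12}Q'_{12}>0,\,Q_{34}Q'_{34}>0\} + 2P\{Q_{12}Q'_{12}<0,\,Q_{34}Q'_{34}<0\} - 1 - \theta^2,
\end{aligned} \tag{2.3.6}
$$

$$
\begin{aligned}
\zeta_1 &= E[h(S_1,S_2;\hat\beta)\,h(S_1,S_3;\hat\beta)] - \theta^2 \\
 &= 2P\{Q_{12}Q'_{12}>0,\,Q_{13}Q'_{13}>0\} + 2P\{Q_{12}Q'_{12}<0,\,Q_{13}Q'_{13}<0\} - 1 - \theta^2,
\end{aligned} \tag{2.3.7}
$$

and

$$
\begin{aligned}
\zeta_2 &= E[h(S_1,S_2;\hat\beta)\,h(S_1,S_2;\hat\beta)] - \theta^2 \\
 &= 1 - \theta^2, \quad \text{if } n > 2.
\end{aligned} \tag{2.3.8}
$$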


We have shown earlier that when the error terms of one of the underlying linear models, the $E$'s say, are symmetrically distributed about zero, then $\theta = 0$ under $H_0$. However, this does not significantly simplify the expressions for $\zeta_0$ and $\zeta_1$, since to evaluate these expressions one needs to know the distribution of the covariate term, $X$, and the joint distributions of variables of the form $\{Q_{ij}, Q_{kl}\}$ and $\{Q'_{ij}, Q'_{kl}\}$. We shall demonstrate this by calculating the null hypothesis value of $\operatorname{Var}[T_n]$ in the special case when $E_i$ ($E'_i$), $i = 1, \ldots, n$, have the standard normal distribution, and when $\hat\beta_1$ and $\hat\beta_2$ are the ordinary least squares estimators of $\beta_1$ and $\beta_2$ respectively. Under the symmetry of the error terms, $\theta = 0$ and, conditional on $X = x$,

$$
[(E_1-E_2)-\hat\beta_1(x_1-x_2),\ (E_3-E_4)-\hat\beta_1(x_3-x_4)]
 \overset{d}{=} [-(E_1-E_2)+\hat\beta_1(x_1-x_2),\ -(E_3-E_4)+\hat\beta_1(x_3-x_4)],
$$

since by property 2.2.5 of $\hat\beta_1$,

$$\hat\beta_1(x_1,\ldots,x_n;\,-E_1,\ldots,-E_n) = -\hat\beta_1(x_1,\ldots,x_n;\,E_1,\ldots,E_n).$$

A similar statement can be made for the terms involving $E'$. Taking expectations with respect to $X$, and using the above arguments and the null hypothesis of the independence of $E_i$ and $E'_i$, the expression for $\zeta_0$ given in (2.3.6) simplifies to

$$
\zeta_0 = E_X\{8P[Q_{12}>0,\,Q_{34}>0]\,P[Q'_{12}>0,\,Q'_{34}>0]
 + 8P[Q_{12}>0,\,Q_{34}<0]\,P[Q'_{12}>0,\,Q'_{34}<0] - 1 \mid X = x\}. \tag{2.3.9}
$$

Similarly,

$$
\zeta_1 = E_X\{8P[Q_{12}>0,\,Q_{13}>0]\,P[Q'_{12}>0,\,Q'_{13}>0]
 + 8P[Q_{12}>0,\,Q_{13}<0]\,P[Q'_{12}>0,\,Q'_{13}<0] - 1 \mid X = x\}, \tag{2.3.10}
$$

where $E_X$ denotes expectation with respect to the vector of covariates, $X$. To obtain an expression for $\operatorname{Var}[T_n]$, we need the following lemma, which we will state without proof:


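Lemma 2.3.11

Suppose
$$\begin{pmatrix} X \\ Y \end{pmatrix} \sim \mathrm{BVN}\left[\begin{pmatrix} 0 \\ 0 \end{pmatrix},\ \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}\right],$$
then
$$P[X \geq 0,\ Y \geq 0] = \frac{1}{4} + \frac{1}{2\pi}\sin^{-1}(\rho),$$
and consequently,
$$P[X \geq 0,\ Y < 0] = \frac{1}{4} - \frac{1}{2\pi}\sin^{-1}(\rho)$$
(see, for example, Cramér, 1966, p. 290).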
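The orthant probability in the lemma is easy to evaluate and to check by simulation. The following is a small sketch; the function names, the seed and the number of replications are arbitrary choices for illustration.

```python
import math
import random

def orthant_prob(rho):
    """P[X >= 0, Y >= 0] for a standard bivariate normal with correlation rho."""
    return 0.25 + math.asin(rho) / (2 * math.pi)

def simulate_orthant(rho, reps=100_000, seed=4):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    scale = math.sqrt(1 - rho * rho)
    hits = 0
    for _ in range(reps):
        u, v = rng.gauss(0, 1), rng.gauss(0, 1)
        x_val, y_val = u, rho * u + scale * v  # correlated standard normals
        hits += (x_val >= 0 and y_val >= 0)
    return hits / reps

exact = orthant_prob(0.5)        # 1/4 + (pi/6)/(2*pi) = 1/3
approx = simulate_orthant(0.5)
print(exact, approx)
```

The complementary probability $P[X \geq 0, Y < 0]$ follows as `0.5 - orthant_prob(rho)`, since $P[X \geq 0] = 1/2$.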

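The least squares estimator $\hat\beta_1$ is given by

$$\hat\beta_1 = \frac{\sum_{i=1}^{n}(x_i-\bar x)E_i}{\sum_{i=1}^{n}(x_i-\bar x)^2} = \frac{\sum_{i=1}^{n}(x_i-\bar x)E_i}{S_{xx}},$$

where

$$S_{xx} = \sum_{i=1}^{n}(x_i-\bar x)^2,$$

so that, for example,

$$
\begin{aligned}
P\{Q_{12}>0,\,Q_{13}>0 \mid X = x\}
 &= P\Bigl\{\Bigl[(E_1-E_2)-\frac{(x_1-x_2)}{S_{xx}}\sum_{i=1}^{n}(x_i-\bar x)E_i\Bigr]>0, \\
 &\qquad\ \ \Bigl[(E_1-E_3)-\frac{(x_1-x_3)}{S_{xx}}\sum_{i=1}^{n}(x_i-\bar x)E_i\Bigr]>0\Bigr\} \\
 &= P\{q_{12}>0,\,q_{13}>0\},
\end{aligned}
$$

where

$$q_{ij} = (E_i-E_j)-\frac{(x_i-x_j)}{S_{xx}}\sum_{k=1}^{n}(x_k-\bar x)E_k.$$

This probability statement involves linear combinations of i.i.d. standard normal random variables. The combinations are also zero mean normal variables, so that by lemma 2.3.11

$$P\{q_{12}>0,\,q_{13}>0\} = \frac{1}{4} + \frac{1}{2\pi}\sin^{-1}[\rho(x,12,13)],$$

where

$$\rho(x,12,13) = \frac{\operatorname{Cov}(q_{12},q_{13})}{\{\operatorname{Var}(q_{12})\operatorname{Var}(q_{13})\}^{1/2}}.$$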

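But,

$$
\begin{aligned}
\operatorname{Cov}(q_{12},q_{13}) &= 1 - \frac{(x_1-x_3)(x_1-\bar x)}{S_{xx}} + \frac{(x_1-x_3)(x_2-\bar x)}{S_{xx}} \\
 &\quad - \frac{(x_1-x_2)(x_1-\bar x)}{S_{xx}} + \frac{(x_1-x_2)(x_3-\bar x)}{S_{xx}} + \frac{(x_1-x_2)(x_1-x_3)}{S_{xx}} \\
 &= 1 - \frac{(x_1-x_2)(x_1-x_3)}{S_{xx}},
\end{aligned}
$$

$$
\begin{aligned}
\operatorname{Var}(q_{12}) &= 1 + 1 + \frac{(x_1-x_2)^2}{S_{xx}} - \frac{2(x_1-x_2)(x_1-\bar x)}{S_{xx}} + \frac{2(x_1-x_2)(x_2-\bar x)}{S_{xx}} \\
 &= 2 - (x_1-x_2)^2/S_{xx},
\end{aligned}
$$

and

$$\operatorname{Var}(q_{13}) = 2 - (x_1-x_3)^2/S_{xx}.$$

These yield

$$\rho(x,12,13) = \frac{S_{xx}-(x_1-x_2)(x_1-x_3)}{\{[2S_{xx}-(x_1-x_2)^2][2S_{xx}-(x_1-x_3)^2]\}^{1/2}}. \tag{2.3.12}$$

To evaluate $P\{q_{12}>0,\,q_{34}>0\}$, we need the quantity

$$\rho(x,12,34) = \frac{\operatorname{Cov}(q_{12},q_{34})}{\{\operatorname{Var}(q_{12})\operatorname{Var}(q_{34})\}^{1/2}},$$

which may similarly be calculated to be

$$\rho(x,12,34) = \frac{-(x_1-x_2)(x_3-x_4)}{\{[2S_{xx}-(x_1-x_2)^2][2S_{xx}-(x_3-x_4)^2]\}^{1/2}}. \tag{2.3.13}$$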
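The probabilities associated with $q'_{ij}$ involve the same quantities given above, since $E_i \overset{d}{=} E'_i$, $i = 1, 2, \ldots, n$. From the expressions for $\zeta_0$ and $\zeta_1$ given in (2.3.9) and (2.3.10) we obtain

$$
\begin{aligned}
\zeta_0 &= E_X\Bigl[8\Bigl\{\frac{1}{4}+\frac{1}{2\pi}\sin^{-1}[\rho(x,12,34)]\Bigr\}^2 + 8\Bigl\{\frac{1}{4}-\frac{1}{2\pi}\sin^{-1}[\rho(x,12,34)]\Bigr\}^2 - 1 \Bigm| X = x\Bigr] \\
 &= 4E_X[\{\sin^{-1}[\rho(x,12,34)]\}^2 \mid X = x]/\pi^2,
\end{aligned}
$$

and

$$\zeta_1 = 4E_X[\{\sin^{-1}[\rho(x,12,13)]\}^2 \mid X = x]/\pi^2.$$

From (2.3.5), we get

$$\operatorname{Var}[T_n] = \binom{n}{2}^{-1}\Bigl[\frac{(n-2)(n-3)}{2}\,\zeta_0 + 2(n-2)\,\zeta_1 + 1\Bigr],$$

where

$$\zeta_0 = \frac{4}{\pi^2}\,E_X\Bigl[\Bigl\{\sin^{-1}\frac{-(X_1-X_2)(X_3-X_4)}{\{[2S_{xx}-(X_1-X_2)^2][2S_{xx}-(X_3-X_4)^2]\}^{1/2}}\Bigr\}^2\Bigr]$$

and

$$\zeta_1 = \frac{4}{\pi^2}\,E_X\Bigl[\Bigl\{\sin^{-1}\frac{S_{xx}-(X_1-X_2)(X_1-X_3)}{\{[2S_{xx}-(X_1-X_2)^2][2S_{xx}-(X_1-X_3)^2]\}^{1/2}}\Bigr\}^2\Bigr],$$

with $E_X$ indicating expectation with respect to the random vector $X$.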
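The correlations in (2.3.12) and (2.3.13) can be cross-checked against a direct calculation, and the expectations over $X$ approximated by simulation. The sketch below assumes standard normal covariates and uses hypothetical helper names; the Monte Carlo settings are arbitrary, and the result is an approximation, not a value reported in this work.

```python
import math
import random
from math import comb

def _sxx(x):
    xbar = sum(x) / len(x)
    return sum((xk - xbar) ** 2 for xk in x)

def rho_12_13(x):
    """Closed form (2.3.12), using the first three covariate values."""
    s = _sxx(x)
    d12, d13 = x[0] - x[1], x[0] - x[2]
    return (s - d12 * d13) / math.sqrt((2 * s - d12 ** 2) * (2 * s - d13 ** 2))

def rho_12_34(x):
    """Closed form (2.3.13), using the first four covariate values."""
    s = _sxx(x)
    d12, d34 = x[0] - x[1], x[2] - x[3]
    return -d12 * d34 / math.sqrt((2 * s - d12 ** 2) * (2 * s - d34 ** 2))

def q_coeffs(x, i, j):
    """Coefficients c_k such that q_ij = sum_k c_k E_k for fixed x (OLS slope)."""
    xbar, s = sum(x) / len(x), _sxx(x)
    c = [-(x[i] - x[j]) * (xk - xbar) / s for xk in x]
    c[i] += 1.0
    c[j] -= 1.0
    return c

def corr(c1, c2):
    """Correlation of two linear combinations of i.i.d. unit-variance errors."""
    dot = sum(u * v for u, v in zip(c1, c2))
    return dot / math.sqrt(sum(u * u for u in c1) * sum(v * v for v in c2))

def var_tn_normal(n, reps=2000, seed=5):
    """Monte Carlo approximation of Var[T_n] for n >= 4, normal X and errors."""
    rng = random.Random(seed)
    z0 = z1 = 0.0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        # Clamp to [-1, 1] to guard asin against rounding error.
        z0 += math.asin(max(-1.0, min(1.0, rho_12_34(x)))) ** 2
        z1 += math.asin(max(-1.0, min(1.0, rho_12_13(x)))) ** 2
    z0 *= 4 / (math.pi ** 2 * reps)
    z1 *= 4 / (math.pi ** 2 * reps)
    return ((n - 2) * (n - 3) / 2 * z0 + 2 * (n - 2) * z1 + 1) / comb(n, 2)

x_fixed = [0.3, -1.2, 0.8, 2.1, -0.4, 1.5]  # arbitrary covariate values
direct_13 = corr(q_coeffs(x_fixed, 0, 1), q_coeffs(x_fixed, 0, 2))
direct_34 = corr(q_coeffs(x_fixed, 0, 1), q_coeffs(x_fixed, 2, 3))
print(rho_12_13(x_fixed) - direct_13, rho_12_34(x_fixed) - direct_34)
print(var_tn_normal(10))
```

The first printed line should be zero up to rounding, confirming that the closed forms agree with the covariance of the underlying linear combinations.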
2.4  The Asymptotic Null Distribution of $T_n$

The asymptotic normality of $T_n$ under $H_0$ is a direct result of a theorem by Randles (1982), which gives the asymptotic normality of a U-statistic which involves an estimated parameter. To verify the conditions (given below) of Randles' theorem, we need the following assumptions:

2.4.1  $E$ ($E'$) is a continuous random variable with a bounded and continuous density function, has median zero and a finite variance.

2.4.2  The covariate term, $X$, has a finite fourth moment.

Consider the U-statistic

$$T_n(\gamma) = \binom{n}{2}^{-1}\sum_{i<j}\operatorname{Sgn}\{[(E_i-E_j)-(\gamma_1-\beta_1)(X_i-X_j)][(E'_i-E'_j)-(\gamma_2-\beta_2)(X_i-X_j)]\}, \tag{2.4.3}$$

where the mathematical variable $\gamma = (\gamma_1,\gamma_2)'$ replaces the estimator $\hat\beta = (\hat\beta_1,\hat\beta_2)'$. The corresponding kernel is

$$h(S_1,S_2;\gamma) = \operatorname{Sgn}\{[(E_1-E_2)-(\gamma_1-\beta_1)(X_1-X_2)][(E'_1-E'_2)-(\gamma_2-\beta_2)(X_1-X_2)]\} \tag{2.4.4}$$

with $S_i = (X_i, E_i, E'_i)'$. The conditions of Randles' theorem are as follows:
Condition 2.4.5

$$n^{1/2}(\hat\beta-\beta) = O_p(1).$$

Condition 2.4.6

Suppose there is a neighborhood of $\beta$, say $K(\beta)$, and a constant $K_1 > 0$ such that if $\gamma \in K(\beta)$ and $D(\gamma,d)$ is a sphere centered at $\gamma$ with radius $d$ satisfying $D(\gamma,d) \subset K(\beta)$, then

$$E\Bigl[\sup_{\gamma' \in D(\gamma,d)} |h(S_1,S_2;\gamma') - h(S_1,S_2;\gamma)|\Bigr] \leq K_1 d.$$

Condition 2.4.7

Suppose there exists a constant $M_1 > 0$ such that

$$|h(x_1,x_2;\gamma) - h(x_1,x_2;\beta)| \leq M_1$$

for every $x_1$, $x_2$ and for all $\gamma$ in some neighborhood of $\beta$.

Condition 2.4.8

$\theta(\gamma)$ has a zero differential at $\gamma = \beta$, where

$$\theta(\gamma) = E[T_n(\gamma)] = E[h(S_1,S_2;\gamma)].$$

Condition 2.4.9

$$n^{1/2}[T_n(\beta) - \theta(\beta)] \xrightarrow{d} N(0,\tau^2),$$

where $\tau^2 = 4\operatorname{Var}\{E[h(S_1,S_2;\beta) \mid S_1]\}$.

THEOREM 2.4.10

Under assumptions 2.4.1 and 2.4.2,

$$n^{1/2}[T_n(\hat\beta) - \theta(\beta)] \xrightarrow{d} N(0,\tau^2), \quad \text{as } n \to \infty.$$

Proof. This is seen by verifying conditions 2.4.5-2.4.9 given above.

Condition 2.4.5

We need to show

$$n^{1/2}(\hat\beta-\beta) = O_p(1).$$

For this condition to hold in general one needs a stronger assumption than the consistency of the estimator $\hat\beta$. For example, from the Markov inequality, and for $i = 1, 2$,

$$P\{n^{1/2}|\hat\beta_i-\beta_i| > \varepsilon\} \leq nE[(\hat\beta_i-\beta_i)^2]/\varepsilon^2,$$

so that it is sufficient that the second moment of $\hat\beta_i$ around $\beta_i$ be of order $n^{-\delta}$, $\delta \geq 2$. However, in our particular setting, when $\hat\beta_i$ is the slope estimator of $\beta_i$ in the simple linear model, we can show that for $i = 1, 2$, $n^{1/2}(\hat\beta_i-\beta_i)$ converges to some bona fide distribution, thereby proving this condition. In what follows we shall demonstrate that, under very broad assumptions, this indeed is the case for the two estimators of interest: (i) the OLS estimator and (ii) the LAV estimator of $\beta$.
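(i) The OLS estimator:

For the model

$$Y_i = \alpha + \beta X_i + E_i, \qquad i = 1, 2, \ldots, n,$$

$$
\hat\beta-\beta = \frac{\sum_{i=1}^{n}(X_i-\bar X)(E_i-\bar E)}{\sum_{i=1}^{n}(X_i-\bar X)^2}
 = \frac{\frac{1}{n}\sum_{i=1}^{n}X_iE_i - \bar X\bar E}{\frac{1}{n}\sum_{i=1}^{n}X_i^2 - \bar X^2}
 = g(C),
$$

where

$$C_1 = \bar X, \quad C_2 = \bar E, \quad C_3 = \frac{1}{n}\sum_{i=1}^{n}X_i^2, \quad C_4 = \frac{1}{n}\sum_{i=1}^{n}X_iE_i,$$

and $C = (C_1,C_2,C_3,C_4)'$. Using properties of sample moments (see, for example, Serfling, 1980, p. 125), we see that

$$C \text{ is } AN\Bigl[E(C),\ \frac{1}{n}\Sigma\Bigr],$$

where $\Sigma$ is the covariance matrix of $(X_1, E_1, X_1^2, X_1E_1)$. By the independence of the $X_i$'s and $E_i$'s, $g(E(C)) = 0$, so that by corollary 3.3 of Serfling

$$(\hat\beta-\beta) = g(C) \text{ is } AN\Bigl(0,\ \frac{1}{n}D'\Sigma D\Bigr),$$

where

$$D = \Bigl(\frac{\partial g}{\partial C_1}\Bigr|_{C=E(C)},\ \ldots,\ \frac{\partial g}{\partial C_4}\Bigr|_{C=E(C)}\Bigr)'.$$

Therefore,

$$n^{1/2}(\hat\beta-\beta) \xrightarrow{d} N(0,\ D'\Sigma D), \quad \text{as } n \to \infty.$$

Hence, since $X$ admits a finite fourth moment, and the error terms have finite variance, $n^{1/2}(\hat\beta-\beta)$ converges in law to a bona fide distribution, implying $n^{1/2}(\hat\beta-\beta) = O_p(1)$.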
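The representation of $\hat\beta - \beta$ as a smooth function $g$ of four sample moments can be confirmed numerically. This sketch assumes the ordering $C = (\bar X, \bar E, n^{-1}\sum X_i^2, n^{-1}\sum X_iE_i)$, which matches the covariance matrix of $(X_1, E_1, X_1^2, X_1E_1)$ named in the text; the function names and parameter values are illustrative.

```python
import random

def ols_slope(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx

def g(c1, c2, c3, c4):
    """g(C) = (C4 - C1*C2) / (C3 - C1^2)."""
    return (c4 - c1 * c2) / (c3 - c1 ** 2)

rng = random.Random(7)  # arbitrary seed
n, alpha, beta = 40, 1.0, 2.5  # illustrative parameter values
x = [rng.gauss(0, 1) for _ in range(n)]
e = [rng.gauss(0, 1) for _ in range(n)]
y = [alpha + beta * xi + ei for xi, ei in zip(x, e)]

c = (
    sum(x) / n,                                    # C1: mean of X
    sum(e) / n,                                    # C2: mean of E
    sum(xi * xi for xi in x) / n,                  # C3: mean of X^2
    sum(xi * ei for xi, ei in zip(x, e)) / n,      # C4: mean of X*E
)
error_direct = ols_slope(x, y) - beta
error_via_g = g(*c)
print(error_direct, error_via_g)
```

Because $g$ is differentiable at $E(C)$ and vanishes there, the delta method applied to the asymptotically normal vector $C$ gives the stated limit.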

(ii) The LAV estimator:

Consider the linear model given in (i) above, and let $\beta^*$ be the LAV estimator of $\beta = (\alpha,\beta)'$, i.e., $\beta^* = (\alpha^*,\beta^*)'$ is a solution to

$$\min_{\beta \in R^2}\Bigl\{\sum_{i=1}^{n}|Y_i-\alpha-\beta X_i|\Bigr\}.$$

Let $H(\cdot)$ denote the absolutely continuous distribution function of $E_i$ with median zero and continuous and positive density $h(\cdot)$ at zero. Let $X_n$ denote the $n \times 2$ regression matrix which depends on $n$ through the sequence of constants $x_1, x_2, \ldots, x_n$. Bassett and Koenker (1978) have shown that if $Q = \lim_{n\to\infty} n^{-1}X_n'X_n$ is positive definite, then $n^{1/2}(\beta^*-\beta)$ converges in distribution to a bivariate normal vector with mean $0$ and covariance matrix $\omega^2Q^{-1}$, where $\omega = [2h(0)]^{-1}$. The above result implies that for the slope estimator $\beta^*$, $n^{1/2}(\beta^*-\beta)$ converges in distribution to a normal random variable with mean zero and variance $\nu^2 = \omega^2\lambda'Q^{-1}\lambda$, where $\lambda = [0,1]'$. Letting

$$g_n(x_n,E_n) = n^{1/2}(\beta^*-\beta)/\nu$$

and

$$F_n(t) = P\{g_n(x_n,E_n) \leq t\},$$

for every sequence of regression constants $\{x_n\}$ for which $Q$ exists and is positive definite, we have

$$\lim_{n\to\infty} F_n(t) = \Phi(t),$$

where $\Phi(\cdot)$ denotes the distribution function of the standard normal random variable. In the case where $X_1, X_2, \ldots, X_n$ is a sequence of random variables defined on a probability space $P$ and having mean zero and variance $\sigma_X^2$,

$$\frac{1}{n}X_n'X_n \xrightarrow{a.s.} \begin{pmatrix} 1 & 0 \\ 0 & \sigma_X^2 \end{pmatrix} \quad \text{as } n \to \infty,$$

where

$$X_n'X_n = \begin{pmatrix} n & n\bar X_n \\ n\bar X_n & \sum_{i=1}^{n}X_i^2 \end{pmatrix}$$

with

$$\bar X_n = \sum_{i=1}^{n}\frac{X_i}{n}.$$

It follows that

$$Q = \begin{pmatrix} 1 & 0 \\ 0 & \sigma_X^2 \end{pmatrix}$$

and

$$\nu^2 = \omega^2\lambda'Q^{-1}\lambda = \omega^2/\sigma_X^2,$$

so that if we let

$$F_{n,X_n}(t) = P\{g_n(X_n,E_n) \leq t \mid X_n\},$$

by Bassett and Koenker's (1978) result we have

$$\lim_{n\to\infty} F_{n,X_n}(t) = \Phi(t) \quad \text{a.e. in } X_n.$$

But

$$P\{g_n(X_n,E_n) \leq t\} = \int F_{n,X_n}(t)\,dP = \int E\{I_{[g_n(X_n,E_n) \leq t]} \mid X_n\}\,dP,$$

where $I_{[\cdot]}$ is an indicator function. Then by the Lebesgue Dominated Convergence theorem

$$\lim_{n\to\infty} P\{g_n(X_n,E_n) \leq t\} = \int \lim_{n\to\infty} F_{n,X_n}(t)\,dP = \Phi(t),$$

and therefore

$$n^{1/2}\sigma_X(\beta^*-\beta)/\omega \xrightarrow{d} N(0,1) \quad \text{as } n \to \infty.$$

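Condition 2.4.6

To verify condition 2.4.6 for the kernel of this setting, we examine the following. Let

$$
\begin{aligned}
S^* &= \sup_{\gamma' \in D(\gamma,d)} |h(S_1,S_2;\gamma') - h(S_1,S_2;\gamma)| \\
 &= \sup_{\gamma' \in D(\gamma,d)} \bigl\{\bigl|\operatorname{Sgn}\{[(E_1-E_2)-(\gamma'_1-\beta_1)(X_1-X_2)][(E'_1-E'_2)-(\gamma'_2-\beta_2)(X_1-X_2)]\} \\
 &\qquad\qquad - \operatorname{Sgn}\{[(E_1-E_2)-(\gamma_1-\beta_1)(X_1-X_2)][(E'_1-E'_2)-(\gamma_2-\beta_2)(X_1-X_2)]\}\bigr|\bigr\}.
\end{aligned}
$$

Denoting $B_i(\xi) = (\xi_i-\beta_i)(X_1-X_2)$, $i = 1, 2$, we have

$$
S^* = \begin{cases}
2 & \text{if } [(E_1-E_2)-B_1(\gamma')][(E'_1-E'_2)-B_2(\gamma')] > 0\ (<0) \\
  & \text{and } [(E_1-E_2)-B_1(\gamma)][(E'_1-E'_2)-B_2(\gamma)] < 0\ (>0), \\
1 & \text{if } [(E_1-E_2)-B_1(\gamma')][(E'_1-E'_2)-B_2(\gamma')] = 0\ (\neq 0) \\
  & \text{and } [(E_1-E_2)-B_1(\gamma)][(E'_1-E'_2)-B_2(\gamma)] \neq 0\ (=0), \\
0 & \text{otherwise.}
\end{cases}
$$

When taking expectations, only the value of $S^* = 2$ contributes to the expected value, since for $S^* = 1$ the expectation involves probabilities of continuous variables taking on zero values. Hence,

$$
\begin{aligned}
E[S^*] &= 2P\{[(E_1-E_2)-B_1(\gamma')][(E'_1-E'_2)-B_2(\gamma')] > 0,\ [(E_1-E_2)-B_1(\gamma)][(E'_1-E'_2)-B_2(\gamma)] < 0\} \\
 &\quad + 2P\{[(E_1-E_2)-B_1(\gamma')][(E'_1-E'_2)-B_2(\gamma')] < 0,\ [(E_1-E_2)-B_1(\gamma)][(E'_1-E'_2)-B_2(\gamma)] > 0\} \\
 &= 2P\{(E_1-E_2)-B_1(\gamma') > 0,\ (E'_1-E'_2)-B_2(\gamma') > 0,\ (E_1-E_2)-B_1(\gamma) < 0,\ (E'_1-E'_2)-B_2(\gamma) > 0\} \\
 &\quad + 2P\{(E_1-E_2)-B_1(\gamma') > 0,\ (E'_1-E'_2)-B_2(\gamma') > 0,\ (E_1-E_2)-B_1(\gamma) > 0,\ (E'_1-E'_2)-B_2(\gamma) < 0\} \\
 &\quad + 2P\{(E_1-E_2)-B_1(\gamma') < 0,\ (E'_1-E'_2)-B_2(\gamma') < 0,\ (E_1-E_2)-B_1(\gamma) > 0,\ (E'_1-E'_2)-B_2(\gamma) < 0\} \\
 &\quad + 2P\{(E_1-E_2)-B_1(\gamma') < 0,\ (E'_1-E'_2)-B_2(\gamma') < 0,\ (E_1-E_2)-B_1(\gamma) < 0,\ (E'_1-E'_2)-B_2(\gamma) > 0\} \\
 &\quad + 2P\{(E_1-E_2)-B_1(\gamma') > 0,\ (E'_1-E'_2)-B_2(\gamma') < 0,\ (E_1-E_2)-B_1(\gamma) > 0,\ (E'_1-E'_2)-B_2(\gamma) > 0\} \\
 &\quad + 2P\{(E_1-E_2)-B_1(\gamma') > 0,\ (E'_1-E'_2)-B_2(\gamma') < 0,\ (E_1-E_2)-B_1(\gamma) < 0,\ (E'_1-E'_2)-B_2(\gamma) < 0\} \\
 &\quad + 2P\{(E_1-E_2)-B_1(\gamma') < 0,\ (E'_1-E'_2)-B_2(\gamma') > 0,\ (E_1-E_2)-B_1(\gamma) > 0,\ (E'_1-E'_2)-B_2(\gamma) > 0\} \\
 &\quad + 2P\{(E_1-E_2)-B_1(\gamma') < 0,\ (E'_1-E'_2)-B_2(\gamma') > 0,\ (E_1-E_2)-B_1(\gamma) < 0,\ (E'_1-E'_2)-B_2(\gamma) < 0\}.
\end{aligned}
$$

Denote the above probabilities by $p_1, p_2, \ldots, p_8$, so that $E[S^*] = 2(p_1+p_2+p_3+p_4+p_5+p_6+p_7+p_8)$, and note that for $X_1 > X_2$,

$$
\begin{aligned}
p_1 &\leq P\{(E_1-E_2)-B_1(\gamma') > 0,\ (E_1-E_2)-B_1(\gamma) < 0\} \\
 &= P\{(E_1-E_2)-(\gamma'_1-\beta_1)(X_1-X_2) > 0,\ (E_1-E_2)-(\gamma_1-\beta_1)(X_1-X_2) < 0\} \\
 &= P\Bigl\{\frac{E_1-E_2}{X_1-X_2} > \gamma'_1-\beta_1,\ \frac{E_1-E_2}{X_1-X_2} < \gamma_1-\beta_1\Bigr\}.
\end{aligned}
$$

Similarly,

$$p_3 \leq P\Bigl\{\frac{E_1-E_2}{X_1-X_2} < \gamma'_1-\beta_1,\ \frac{E_1-E_2}{X_1-X_2} > \gamma_1-\beta_1\Bigr\}.$$

By assumptions 2.4.1 and 2.4.2 the random variable $\frac{E_1-E_2}{X_1-X_2}$ $\Bigl[\frac{E'_1-E'_2}{X_1-X_2}\Bigr]$ has a distribution function $K(\cdot)$ $[K'(\cdot)]$, and a density $k(\cdot)$ $[k'(\cdot)]$ which is bounded by $B$ $[B']$ and continuous, so that

$$
\begin{aligned}
p_1+p_3 &\leq 2P\Bigl\{\min(\gamma'_1-\beta_1,\gamma_1-\beta_1) < \frac{E_1-E_2}{X_1-X_2} < \max(\gamma'_1-\beta_1,\gamma_1-\beta_1)\Bigr\} \\
 &= 2K[\max(\gamma'_1-\beta_1,\gamma_1-\beta_1)] - 2K[\min(\gamma'_1-\beta_1,\gamma_1-\beta_1)] \\
 &= 2\,|\max(\gamma'_1-\beta_1,\gamma_1-\beta_1) - \min(\gamma'_1-\beta_1,\gamma_1-\beta_1)|\,k(\xi^*) \\
 &\leq 2\,|\max(\gamma'_1,\gamma_1) - \min(\gamma'_1,\gamma_1)|\,B \\
 &\leq 2dB,
\end{aligned}
$$

where $\xi^* = \min(\gamma'_1-\beta_1,\gamma_1-\beta_1) + \delta[\max(\gamma'_1,\gamma_1) - \min(\gamma'_1,\gamma_1)]$ for $|\delta| < 1$, and since $\gamma' \in D(\gamma,d)$.

Similar bounds of $2dB$ or $2dB'$ hold for the remaining pairs of probabilities, so that $E[S^*] \leq 8d(B+B')$, which proves condition 2.4.6 with $K_1 = 8(B+B')$.

Condition 2.4.7

By the definition of the kernel $h$, this condition holds with $M_1 = 2$.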


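Condition 2.4.8

We need to show that $\theta(\gamma)$ has a zero differential at $\gamma = \beta$, where $\theta(\gamma) = E[T_n(\gamma)] = E[h(S_1,S_2;\gamma)]$.

Using the notation adopted under condition 2.4.6, with $b_i(\gamma) = (\gamma_i-\beta_i)(x_1-x_2)$, and conditioning on $X_1$ and $X_2$,

$$
\begin{aligned}
E[h(S_1,S_2;\gamma) \mid X_1 = x_1, X_2 = x_2]
 &= E[\operatorname{Sgn}\{[(E_1-E_2)-b_1(\gamma)][(E'_1-E'_2)-b_2(\gamma)]\}] \\
 &= P\{(E_1-E_2) > b_1(\gamma),\ (E'_1-E'_2) > b_2(\gamma)\} \\
 &\quad + P\{(E_1-E_2) < b_1(\gamma),\ (E'_1-E'_2) < b_2(\gamma)\} \\
 &\quad - P\{(E_1-E_2) > b_1(\gamma),\ (E'_1-E'_2) < b_2(\gamma)\} \\
 &\quad - P\{(E_1-E_2) < b_1(\gamma),\ (E'_1-E'_2) > b_2(\gamma)\} \\
 &= 2P\{(E_1-E_2) > b_1(\gamma),\ (E'_1-E'_2) > b_2(\gamma)\} \\
 &\quad + 2P\{(E_1-E_2) < b_1(\gamma),\ (E'_1-E'_2) < b_2(\gamma)\} - 1.
\end{aligned}
$$

Under the null hypothesis of the independence of $E_i$ and $E'_i$, $i = 1, 2, \ldots, n$, the above probabilities factor to yield

$$
E[h(S_1,S_2;\gamma) \mid X_1 = x_1, X_2 = x_2]
 = 2[1-F_1(b_1(\gamma))][1-F_2(b_2(\gamma))] + 2F_1(b_1(\gamma))F_2(b_2(\gamma)) - 1,
$$

where $F_1$ ($F_2$) denotes the distribution function of $E_1-E_2$ ($E'_1-E'_2$), with density $f_1$ ($f_2$). Thus

$$\theta(\gamma) = E_{X_1,X_2}\{2[1-F_1(B_1(\gamma))][1-F_2(B_2(\gamma))] + 2F_1(B_1(\gamma))F_2(B_2(\gamma)) - 1\},$$

where

$$B_i(\xi) = (\xi_i-\beta_i)(X_1-X_2), \qquad i = 1, 2,$$

and $E_{X_1,X_2}$ denotes a two-fold integral yielding the expectation with respect to $X_1$ and $X_2$. Using assumptions 2.4.1 and 2.4.2, differentiation with respect to $\gamma$ may be passed inside the integral (see, for example, theorem A.2.4 of Randles and Wolfe, 1979), yielding the differential of the function $\theta(\gamma)$ to be

$$
\begin{aligned}
 & E_{X_1,X_2}\{-2\,\partial B_1(\gamma)f_1(B_1(\gamma))[1-F_2(B_2(\gamma))] - 2\,\partial B_2(\gamma)f_2(B_2(\gamma))[1-F_1(B_1(\gamma))] \\
 &\qquad + 2\,\partial B_1(\gamma)f_1(B_1(\gamma))F_2(B_2(\gamma)) + 2\,\partial B_2(\gamma)f_2(B_2(\gamma))F_1(B_1(\gamma))\} \\
 &= E_{X_1,X_2}\{2\,\partial B_1(\gamma)f_1(B_1(\gamma))[2F_2(B_2(\gamma))-1] + 2\,\partial B_2(\gamma)f_2(B_2(\gamma))[2F_1(B_1(\gamma))-1]\} \\
 &= 0 \quad \text{at } \gamma = \beta,
\end{aligned}
$$

since $B_i(\beta) = 0$, $i = 1, 2$, and $F_1(0) = F_2(0) = 1/2$. This proves condition 2.4.8.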

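Condition 2.4.9

We need to show that

$$n^{1/2}[T_n(\beta) - \theta(\beta)] \xrightarrow{d} N(0,\tau^2).$$

This is a direct consequence of U-statistics theorems, since under $H_0$,

$$T_n(\beta) = \binom{n}{2}^{-1}\sum_{i<j}\operatorname{Sgn}\{(E_i-E_j)(E'_i-E'_j)\}$$

is a U-statistic based on the i.i.d. random variables $(E_1,E'_1), \ldots, (E_n,E'_n)$. (See, for example, Theorem 3.3.13, Randles and Wolfe, 1979.) Further, under $H_0$,

$$
\begin{aligned}
\theta(\beta) &= E[h(S_1,S_2;\beta)] \\
 &= 2P\{E_1-E_2>0,\ E'_1-E'_2>0\} + 2P\{E_1-E_2<0,\ E'_1-E'_2<0\} - 1 \\
 &= 2P\{E_1-E_2>0\}\cdot P\{E'_1-E'_2>0\} + 2P\{E_1-E_2<0\}\cdot P\{E'_1-E'_2<0\} - 1 \\
 &= 0,
\end{aligned}
$$

and

$$
\begin{aligned}
\tau^2 &= 4\operatorname{Var}\{E[h(S_1,S_2;\beta) \mid S_1]\} \\
 &= 4E_{S_1}\{(E[h(S_1,S_2;\beta) \mid S_1])^2\},
\end{aligned}
$$

since

$$E_{S_1}\{E[h(S_1,S_2;\beta) \mid S_1]\} = E[h(S_1,S_2;\beta)] = 0.$$

Writing $s = (t,u,v)$, and by procedures similar to those given in verifying condition 2.4.8,

$$
\begin{aligned}
E[h(S_1,S_2;\beta) \mid S_1 = s] &= 2P\{(u-E_2) > 0,\ (v-E'_2) > 0\} + 2P\{(u-E_2) < 0,\ (v-E'_2) < 0\} - 1 \\
 &= 2H_1(u)H_2(v) + 2[1-H_1(u)][1-H_2(v)] - 1,
\end{aligned}
$$

where $H_1(\cdot)$ ($H_2(\cdot)$) is the distribution function of $E$ ($E'$). Hence

$$
\begin{aligned}
\tau^2 = 4\int\!\!\int \{ & 4H_1^2(u)H_2^2(v) + 4[1-H_1(u)]^2[1-H_2(v)]^2 + 1 \\
 & + 8H_1(u)H_2(v)[1-H_1(u)][1-H_2(v)] - 4H_1(u)H_2(v) \\
 & - 4[1-H_1(u)][1-H_2(v)]\}\,dH_1(u)\,dH_2(v).
\end{aligned}
$$

The above expression contains four types of terms:

(i)
$$\int\!\!\int H_1^2(u)H_2^2(v)\,dH_1(u)\,dH_2(v) = \frac{1}{3}H_1^3(u)\Bigr|_0^1 \cdot \frac{1}{3}H_2^3(v)\Bigr|_0^1 = \frac{1}{9},$$

(ii)
$$\int\!\!\int [1-H_1(u)]^2[1-H_2(v)]^2\,dH_1(u)\,dH_2(v) = -\frac{1}{3}[1-H_1(u)]^3\Bigr|_0^1 \cdot \Bigl(-\frac{1}{3}\Bigr)[1-H_2(v)]^3\Bigr|_0^1 = \frac{1}{9},$$

(iii)
$$
\int\!\!\int [H_1(u)-H_1^2(u)][H_2(v)-H_2^2(v)]\,dH_1(u)\,dH_2(v)
 = \Bigl[\frac{1}{2}H_1^2(u)-\frac{1}{3}H_1^3(u)\Bigr]\Bigr|_0^1 \cdot \Bigl[\frac{1}{2}H_2^2(v)-\frac{1}{3}H_2^3(v)\Bigr]\Bigr|_0^1 = \frac{1}{36},
$$

and

(iv)
$$
\int\!\!\int H_1(u)H_2(v)\,dH_1(u)\,dH_2(v) = \frac{1}{2}H_1^2(u)\Bigr|_0^1 \cdot \frac{1}{2}H_2^2(v)\Bigr|_0^1 = \frac{1}{4}
 = \int\!\!\int [1-H_1(u)][1-H_2(v)]\,dH_1(u)\,dH_2(v).
$$

Therefore

$$\tau^2 = 4\Bigl[4\Bigl(\frac{1}{9}\Bigr) + 4\Bigl(\frac{1}{9}\Bigr) + 1 + 8\Bigl(\frac{1}{36}\Bigr) - 4\Bigl(\frac{1}{4}\Bigr) - 4\Bigl(\frac{1}{4}\Bigr)\Bigr] = \frac{4}{9}.$$

Conditions 2.4.5-2.4.9 are satisfied so that by Randles' theorem (1982), under $H_0$,

$$n^{1/2}\,T_n \xrightarrow{d} N\Bigl(0,\ \frac{4}{9}\Bigr).$$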
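The arithmetic leading to $\tau^2 = 4/9$ can be reproduced with exact fractions. This is a small check, not part of the original derivation; the helper `int_power` is a hypothetical name for $\int_0^1 t^k\,dt$, which is what each integral reduces to after the substitution $t = H(u)$.

```python
from fractions import Fraction

def int_power(k):
    """Integral of t^k over [0, 1]."""
    return Fraction(1, k + 1)

# (i)   double integral of H1^2 * H2^2
term_i = int_power(2) ** 2
# (ii)  double integral of (1-H1)^2 * (1-H2)^2, equal to (i) by symmetry
term_ii = int_power(2) ** 2
# (iii) double integral of (H1 - H1^2) * (H2 - H2^2)
term_iii = (int_power(1) - int_power(2)) ** 2
# (iv)  double integral of H1 * H2, also equal to that of (1-H1) * (1-H2)
term_iv = int_power(1) ** 2

tau_sq = 4 * (4 * term_i + 4 * term_ii + 1 + 8 * term_iii - 4 * term_iv - 4 * term_iv)
print(term_i, term_iii, term_iv, tau_sq)
```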


2.5  The Simulated Null Distribution of $T_n$ Under Normality

The tables in this section contain the empirical null distributions of $T_n$ obtained by a Monte Carlo simulation study. This and all other studies in subsequent chapters were performed on the University of Florida IBM-3033 using Fortran. A copy of some main programs and subroutines used in this work is given in the appendix.

In generating the distribution of $T_n$ under the hypothesis of the independence of $E$ and $E'$, we bear in mind that the distribution of $T_n$ is free of the regression constants involved in the underlying linear models (2.1.1), and of the location and scale parameters of $X$, $E$ and $E'$, as discussed in section 2.2. The simulated distribution of $T_n$ is then obtained as follows: the IMSL subroutine GGNML is used to generate $3n$ i.i.d. random variables from the standard normal distribution. These are then divided into three groups of size $n$ each to yield $X_i$, $E_i$ and $E'_i$, $i = 1, 2, \ldots, n$, and the following models are obtained:

$$Y_i = X_i + E_i$$

and

$$Z_i = X_i + E'_i, \qquad i = 1, 2, \ldots, n.$$

From these models we obtain residual pairs in two ways: (i) by the ordinary least squares (OLS) procedure, and (ii) by the least absolute value (LAV) method. The LAV estimates of the regression parameters were obtained by an algorithm given by Josvanger and Sposito (1983). This algorithm is reproduced in the appendix. In each of the two cases (the OLS and the LAV), the usual Kendall's tau was calculated on the residuals. This process was repeated 10,000 times, and the frequency distributions for the different possible values of the statistic were recorded. The empirical relative frequency distributions of $T_n$ for the two cases are given in Tables 2.1 and 2.2, respectively.
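The simulation procedure just described can be sketched as follows for the OLS case. This is an illustrative re-implementation under stated assumptions, not the Fortran program used in the study: Python's `random.gauss` stands in for the IMSL generator, the function names are hypothetical, and the example uses $n = 6$ with 2,000 replications (rather than 10,000) to keep it fast.

```python
import random
from collections import Counter
from itertools import combinations

def ols_residuals(x, y):
    """Residuals from the least squares fit of y on x (with intercept)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return [(yi - ybar) - slope * (xi - xbar) for xi, yi in zip(x, y)]

def kendall_s(r, r_prime):
    """C(n,2) * T_n: concordant minus discordant residual pairs."""
    s = 0
    for i, j in combinations(range(len(r)), 2):
        prod = (r[i] - r[j]) * (r_prime[i] - r_prime[j])
        s += (prod > 0) - (prod < 0)
    return s

def simulate_null(n, reps, seed):
    """Frequency table of C(n,2)*T_n under independence of E and E'."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        e = [rng.gauss(0, 1) for _ in range(n)]
        e_prime = [rng.gauss(0, 1) for _ in range(n)]
        y = [xi + ei for xi, ei in zip(x, e)]          # Y_i = X_i + E_i
        z = [xi + ei for xi, ei in zip(x, e_prime)]    # Z_i = X_i + E'_i
        counts[kendall_s(ols_residuals(x, y), ols_residuals(x, z))] += 1
    return counts

def upper_tail(counts, reps, point):
    """Empirical estimate of P0[C(n,2) * T_n >= point]."""
    return sum(c for s, c in counts.items() if s >= point) / reps

reps = 2000
counts = simulate_null(n=6, reps=reps, seed=8)
alpha_hat = upper_tail(counts, reps, 1)
print(sorted(counts.items()))
print(alpha_hat)
```

Each tabulated entry corresponds to `upper_tail` evaluated at one point $x$ for one sample size; a LAV version would replace `ols_residuals` with residuals from a least absolute value fit.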
Table 2.1

The Null Distribution of $T_n$ (OLS fit)

For a given $n$, the entry in the table for the point $x$ is $\hat\alpha$, the empirical estimate of $\alpha = P_0[\binom{n}{2}T_n \geq x]$, where $T_n$ is obtained from the residuals of an OLS fit.

                                  n
   x      6      7     10     11     14     15     18     19
   1  .4991  .5000  .5003  .5012  .4998  .4890  .5043  .5034
   3  .3773  .3992  .4361  .4438  .4603  .4569  .4739  .4752
   5  .2667  .3049  .3715  .3908  .4197  .4229  .4422  .4497
   7  .1702  .2174  .3100  .3364  .3803  .3871  .4143  .4250
   9  .0968  .1451  .2537  .2854  .3428  .3504  .3374  .3979
  11  .0478  .0877  .2022  .2378  .3073  .3157  .3581  .3719
  13  .0187  .0492  .1628  .1927  .2721  .2831  .3311  .3460
  15  .0040  .0237  .1221  .1544  .2400  .2534  .3066  .3173
  17         .0103  .0901  .1228  .2055  .2249  .2810  .2918
  19         .0033  .0631  .0956  .1768  .1978  .2588  .2725
  21         .0006  .0444  .0746  .1493  .1697  .2367  .2533
  23                .0294  .0544  .1264  .1477  .2164  .2322
  25                .0195  .0391  .1055  .1264  .1958  .2128
  27                .0128  .0265  .0889  .1086  .1757  .1925
  29                .0083  .0161  .0713  .0908  .1572  .1738
  31                .0041  .0110  .0578  .0749  .1375  .1563
  33                .0023  .0066  .0456  .0611  .1233  .1401
  35                .0012  .0047  .0362  .0510  .1101  .1257
  37                       .0028  .0287  .0409  .0981  .1123
  39                       .0012  .0229  .0336  .0874  .0994
  41                       .0007  .0158  .0251  .0756  .0863
  43                       .0004  .0125  .0202  .0665  .0754
  45                              .0086  .0160  .0565  .0657
  47                              .0056  .0122  .0476  .0579
  49                              .0045  .0087  .0404  .0500
  51                              .0031  .0071  .0326  .0423
  53                              .0020  .0053  .0267  .0363
  55                              .0009  .0038  .0226  .0309
  57                              .0005  .0026  .0182  .0272
  59                              .0003  .0019  .0152  .0231
  61                                     .0017  .0121  .0193
  63                                     .0012  .0092  .0167
  65                                     .0008  .0071  .0135
  67                                            .0052  .0109
  69                                            .0042  .0093
  71                                            .0029  .0082
  73                                            .0021  .0064
  75                                            .0016  .0054
  77                                            .0013  .0044
  79                                            .0011  .0040
  81                                            .0010  .0029
  83                                            .0008  .0023
  85                                            .0007  .0021
  87                                                   .0018
  89                                                   .0011
  91                                                   .0008
  93                                                   .0007

                                      n
   x      4      5      8      9     12     13     16     17     20
   0  .5867  .5785  .5482  .5434  .5329  .5236  .5182  .5114  .5152
   2  .4089  .4299  .4620  .4648  .4847  .4787  .4829  .4812  .4932
   4  .2503  .2809  .3703  .3922  .4341  .4296  .4492  .4493  .4677
   6  .1056  .1630  .2907  .3190  .3827  .3855  .4134  .4185  .4404
   8  .0000  .0751  .2241  .2550  .3346  .3441  .3833  .3897  .4203
  10         .0229  .1596  .1965  .2865  .3022  .3494  .3570  .3954
  12                .1091  .1460  .2437  .2667  .3143  .3280  .3682
  14                .0701  .1063  .2069  .2296  .2844  .2961  .3463
  16                .0420  .0717  .1742  .1945  .2568  .2724  .3253
  18                .0239  .0468  .1413  .1616  .2298  .2482  .3020
  20                .0123  .0275  .1138  .1339  .2067  .2241  .2823
  22                .0045  .0157  .0884  .1070  .1823  .1999  .2606
  24                .0018  .0095  .0652  .0870  .1582  .1787  .2388
  26                .0008  .0050  .0504  .0709  .1380  .1597  .2218
  28                       .0022  .0395  .0543  .1195  .1419  .2049
  30                       .0007  .0293  .0424  .1025  .1244  .1868
  32                              .0202  .0337  .0873  .1083  .1697
  34                              .0142  .0260  .0759  .0949  .1537
  36                              .0097  .0194  .0632  .0803  .1396
  38                              .0072  .0136  .0518  .0691  .1253
  40                              .0048  .0089  .0434  .0599  .1117
  42                              .0027  .0056  .0364  .0511  .1004
  44                              .0014  .0044  .0305  .0423  .0907
  46                              .0009  .0028  .0240  .0347  .0789
  48                              .0003  .0018  .0202  .0276  .0701
  50                                     .0012  .0162  .0226  .0625
  52                                     .0006  .0134  .0182  .0546
  54                                            .0103  .0152  .0469
  56                                            .0076  .0120  .0416
  58                                            .0060  .0094  .0371
  60                                            .0048  .0076  .0326
  62                                            .0039  .0059  .0283
  64                                            .0033  .0043  .0241
  66                                            .0019  .0033  .0205
  68                                            .0016  .0025  .0169
  70                                            .0014  .0023  .0141
  72                                            .0010  .0016  .0120
  74                                                   .0011  .0098
  76                                                   .0006  .0086
  78                                                   .0004  .0067
  80                                                          .0052
  82                                                          .0038
  84                                                          .0033
  86                                                          .0023
  88                                                          .0019
  90                                                          .0017
  92                                                          .0016
  94                                                          .0011
  96                                                          .0010
  98                                                          .0009
Table 2.2

The Null Distribution of $T_n$ (LAV fit)

For a given $n$, the entry in the table for the point $x$ is $\hat\alpha$, the empirical estimate of $\alpha = P_0[\binom{n}{2}T_n \geq x]$, where $T_n$ is obtained from the residuals of an LAV fit.

                                  n
   x      6      7     10     11     14     15     18     19
   1  .5208  .5038  .5072  .5097  .5041  .5072  .5044  .5015
   3  .3766  .3930  .4349  .4458  .4641  .4672  .4741  .4742
   5  .2499  .2958  .3724  .3903  .4193  .4299  .4439  .4467
   7  .1528  .2081  .3103  .3338  .3735  .3936  .4156  .4212
   9  .0938  .1381  .2506  .2836  .3323  .3579  .3888  .3955
  11  .0559  .0844  .2010  .2323  .2957  .3214  .3589  .3692
  13  .0263  .0493  .1554  .1895  .2633  .2890  .3344  .3460
  15  .0057  .0259  .1163  .1547  .2298  .2595  .3073  .3214
  17  .0000  .0121  .0878  .1176  .1988  .2296  .2816  .2966
  19         .0037  .0612  .0888  .1693  .2021  .2562  .2753
  21         .0009  .0414  .0642  .1441  .1757  .2332  .2522
  23         .0000  .0277  .0465  .1201  .1541  .2120  .2304
  25                .0182  .0323  .1002  .1308  .1921  .2099
  27                .0110  .0230  .0829  .1086  .1707  .1896
  29                .0061  .0159  .0685  .0892  .1510  .1695
  31                .0029  .0095  .0548  .0759  .1357  .1540
  33                .0017  .0052  .0434  .0617  .1224  .1367
  35                .0009  .0036  .0339  .0501  .1082  .1200
  37                .0005  .0020  .0246  .0416  .0944  .1069
  39                .0004  .0014  .0180  .0326  .0810  .0955
  41                       .0005  .0128  .0260  .0697  .0829
  43                       .0004  .0093  .0197  .0593  .0734
  45                       .0002  .0065  .0162  .0496  .0648
  47                       .0000  .0042  .0127  .0413  .0564
  49                              .0028  .0093  .0351  .0495
  51                              .0022  .0070  .0284  .0420
  53                              .0017  .0055  .0242  .0353
  55                              .0011  .0036  .0194  .0289
  57                              .0008  .0030  .0154  .0246
  59                              .0005  .0021  .0129  .0201
  61                              .0004  .0017  .0094  .0172
  63                              .0003  .0011  .0077  .0142
  65                                     .0007  .0059  .0122
  67                                     .0001  .0046  .0107
  69                                     .0001  .0034  .0091
  71                                            .0024  .0076
  73                                            .0019  .0062
  75                                            .0015  .0053
  77                                            .0012  .0045
  79                                            .0008  .0035
  81                                            .0007  .0029
  83                                            .0007  .0019
  85                                            .0003  .0015
  87                                            .0001  .0010
  89                                            .0000  .0008
  91                                                   .0006
  93                                                   .0004
  95                                                   .0003
  97                                                   .0003
  99                                                   .0003

n 


X 4 5 8 9 12  13  16  17  20 


0 

.5652 

.5887 

.5417 

.5362 

2 

.3976 

.4258 

.4511 

.4620 

4 

.2805 

.2801 

.3604 

.3911 

6 

.1269 

.1617 

.2823 

.3161 

8 

.0000 

.0882 

.2106 

.2493 

10 

.0312 

.1440 

.1928 

12 

.0000 

.0971 

.1421 

14 

.0607 

.1037 

16 

.0360 

.0700 

18 

.0211 

.0455 

20 

.0112 

.0295 

22 

.0061 

.0185 

24 

.0029 

.0099 

26 

.0007 

.0052 

28 

.0000 

.0031 

30 

.0016 

.5267 

.5245 

.5089 

.5157 

.5170 

.4776 

.4769 

.4768 

.4829 

.4935 

.4236 

.4329 

.4416 

.4509 

.4674 

.3754 

.3868 

.4117 

.4236 

.4407 

.3211 

.3394 

.3758 

.3909 

.4161 

.2746 

.2991 

.3459 

.3588 

.3916 

.2372 

.2600 

.3134 

.3285 

.3687 

.1984 

.2246 

.2827 

.2990 

.3455 

.1628 

.1916 

.2555 

.2708 

.3230 

.1339 

.1599 

.2276 

.2458 

.2997 

.1106 

.1333 

.2029 

.2219 

.2795 

.0893 

.1060 

.1812 

.1995 

.2618 

.0686 

.0863 

.1554 

.1798 

.2427 

.0514 

.0691 

.1354 

.1590 

.2225 

.0373 

.0539 

.1173 

.1418 

.2020 

.0259 

.0421 

.1001 

.1246 

.1832 



Table  2.2-continued. 


X 


32 

34 

36 

38 

40 

42 

44 

46 

48 

50 

52 

54 

56 

58 

60 

62 

64 

66 

68 

70 

72 
74 
76 

78
80 
82 
84 
86 
88
90 
92 
94 
96 
98 

100 

102 

104 

106 

108 


9 


.0008 

.0003 


n 


12 

13 

16 

17 

20 

.0182 

.0327 

.0855 

.1097 

.1671 

.0129 

.0236 

.0702 

.0967 

.1490 

.0083 

.0177 

.0581 

.0838 

.1365 

.0055 

.0139 

.0489 

.0737 

.1241 

.0034 

.0103 

.0403 

.0631 

.1114 

.0018 

.0073 

.0332 

.0543 

.1002 

.0013 

.0051 

.0270 

.0453 

.0869 

.0006 

.0036 

.0206 

.0385 

.0760 

.0003 

.0023 

.0169 

.0312 

.0674 

.0001 

.0017 

.0129 

.0256 

.0589 

.0010 

.0104 

.0217 

.0516 

.0005 

.0083 

.0167 

.0437 

.0004 

.0061 

.0126 

.0387 

.0003 

.0043 

.0093 

.0349 

.0038 

.0070 

.0308 

.0026 

.0052 

.0278 

.0024 

.0039 

.0245 

.0017 

.0023 

.0212 

.0014 

.0023 

.0173 

.0010 

.0019 

.0144 

.0005 

.0013 

.0113 

.0003 

.0012 

.0092 

.0007 

.0073 

.0006 

.0059 

.0004 

.0046 

.0003 

.0035 

.0029 

.0021 

.0021 

.0017 

.0015 

.0014 

.0011 

.0010 

.0010 

.0009 

.0005 

.0005 

.0002 


CHAPTER  THREE 

THE ASYMPTOTIC EFFICIENCY OF $T_n$ RELATIVE TO THE
PEARSON PARTIAL CORRELATION COEFFICIENT

3.1  Introduction 

When investigating the performance of statistical tests for
independence, the researcher is faced with the crucial problem of
specifying an appropriate class of alternatives which is (i)
sufficiently wide to encompass a large variety of situations, and (ii)
mathematically manageable. In our setting, this problem is further
complicated by the presence of the slope estimators, which induce
dependence among the residual pairs $(U_i, V_i)$, i = 1, 2, ..., n. To
attain maximum generality and at the same time keep our investigation
mathematically manageable, we adopt the "trivariate reduction" model
for the errors. This is the model recommended by Hajek and Sidak
(1967) for parametrizing the class of alternatives to the hypothesis
of independence. Similar models were also considered by Konijn (1956)
and Shirahata (1977).

The class of alternatives is constructed as follows: let

\[ E_i = W_{1i} + \Delta W_{3i} \]

and

\[ E'_i = W_{2i} + \Delta W_{3i} , \qquad i = 1, 2, \ldots, n , \]

where $\{W_{1i}\}$, $\{W_{2i}\}$ and $\{W_{3i}\}$, i = 1, 2, ..., n, are three
independent random samples of continuous random variables. The
hypothesis that $E_i$ and $E'_i$ are independent is equivalent to the
hypothesis that $\Delta = 0$, so that the test is equivalently given by

\[ H_0 : \Delta = 0 \quad \text{versus} \quad H_a : \Delta \neq 0 . \]

To study the Pitman asymptotic relative efficiency (ARE), we will
further suppose that $\{\Delta_n\}$ is a sequence of parameters converging to
the null hypothesis value, i.e., $\lim_{n\to\infty} \Delta_n = 0$.
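As an informal illustration of the trivariate reduction construction, the following sketch simulates the errors for a chosen value of $\Delta$ and computes Kendall's tau between E and E'. The standard normal choice for the W's, the sample size, the seed, and the helper names are assumptions made only for this illustration; they are not taken from the text.

```python
import random


def kendall_tau(x, y):
    """Kendall's tau without tie correction: average sign product over all pairs."""
    n = len(x)
    total = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            total += (prod > 0) - (prod < 0)
    return total / (n * (n - 1) / 2)


def trivariate_reduction_errors(n, delta, rng):
    """Draw n pairs with E_i = W_1i + delta*W_3i and E'_i = W_2i + delta*W_3i."""
    e, e_prime = [], []
    for _ in range(n):
        # Assumption: W_1, W_2, W_3 are independent standard normal variables.
        w1, w2, w3 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        e.append(w1 + delta * w3)
        e_prime.append(w2 + delta * w3)
    return e, e_prime


rng = random.Random(0)
e0, ep0 = trivariate_reduction_errors(200, 0.0, rng)  # delta = 0: independent errors
e1, ep1 = trivariate_reduction_errors(200, 2.0, rng)  # delta = 2: shared W_3 component
tau_null = kendall_tau(e0, ep0)
tau_alt = kendall_tau(e1, ep1)
print(round(tau_null, 3), round(tau_alt, 3))
```

With $\Delta = 0$ the sample tau should be close to zero, while a nonzero $\Delta$ induces positive dependence through the shared component $W_{3i}$.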

In section 3.2, we give a main result which ensures the
asymptotic normality of a U-statistic with an estimated parameter
under a sequence of alternatives converging to the null hypothesis.
In section 3.3, we shall apply the results of section 3.2 to obtain
the asymptotic normality of $T_n$, and in section 3.4 we derive the
asymptotic distribution of the partial correlation coefficient, $R_n$.
Section 3.5 contains the applications of a theorem by Noether, by
which an expression for the asymptotic efficiency of $T_n$ relative to $R_n$
is obtained. A table of ARE's calculated for several underlying
distributions is given at the end of section 3.5.

3.2  The  Asymptotic  Normality  of  a U-statistic  with  an 
Estimated  Parameter  Under  a Sequence  of  Alternatives 

The main result in this section is an extension of a theorem by
Randles (1982) which involves a generalization of a result given by
Sukhatme (1958). Randles' theorem is slightly modified to apply to
the more general case where the U-statistic, $U_n$, and its moments are
functions of the sample size, n, through the observations
$X_{1:n}, X_{2:n}, \ldots, X_{n:n}$, whose distribution in turn depends on n,
perhaps through a sequence of parameters $\Delta_n$.

Let $X_{1:n}, X_{2:n}, \ldots, X_{n:n}$ denote a random sample from some
distribution with distribution function $F_n(x)$, possibly changing as n
changes, and let $h(x_1, \ldots, x_r; \gamma)$ denote a symmetric kernel of
degree r with expected value

\[ \theta_n(\gamma) = E_{\beta}[h(X_{1:n}, \ldots, X_{r:n}; \gamma)] , \]

where $\beta$ denotes a p-dimensional parameter value, and $\gamma$ is, in
general, a mathematical variable. Both the kernel and its expected
value may depend on $\gamma$, and on n through $X_{1:n}, \ldots, X_{r:n}$. The
corresponding U-statistic is then

\[ U_n(\gamma) = \binom{n}{r}^{-1} \sum_{\alpha \in A} h(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \gamma) , \tag{3.2.1} \]

where A denotes the collection of all subsets of size r from the set
of integers {1, 2, ..., n}. The main result of this section gives
the asymptotic normality of

\[ n^{1/2}[U_n(\hat\beta) - \theta_n(\beta)] , \]

where $\hat\beta$ is an estimator of the parameter $\beta$. The key step in the
proof of the main result requires that

\[ n^{1/2}[U_n(\hat\beta) - \theta_n(\hat\beta) - U_n(\beta) + \theta_n(\beta)] \xrightarrow{p} 0 , \quad \text{as } n \to \infty . \tag{3.2.2} \]


The proof of (3.2.2) is given in Theorem 3.2.8, but first we prove a
lemma and list the conditions needed for the proof of (3.2.2).
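To make the object in (3.2.1) concrete, here is a minimal sketch of a U-statistic whose kernel depends on a mathematical variable, evaluated at an estimate. The kernel `product_kernel` is a hypothetical example chosen for illustration (it is not the kernel studied in this chapter): its expected value is $(\mu - \gamma)^2$, which has a zero derivative at $\gamma = \mu$, the kind of zero-differential behaviour that the main result of this section relies on.

```python
from itertools import combinations
from math import comb


def u_statistic(sample, kernel, r, gamma):
    """Average of kernel(x_a1, ..., x_ar, gamma) over all subsets of size r."""
    total = sum(kernel(*subset, gamma) for subset in combinations(sample, r))
    return total / comb(len(sample), r)


def product_kernel(x1, x2, gamma):
    """Hypothetical symmetric kernel of degree 2; E[h] = (mu - gamma)^2."""
    return (x1 - gamma) * (x2 - gamma)


sample = [1.0, 2.0, 3.0, 6.0]
gamma_hat = sum(sample) / len(sample)  # estimate plugged in for gamma
u_at_estimate = u_statistic(sample, product_kernel, 2, gamma_hat)
print(u_at_estimate)
```

The function enumerates all $\binom{n}{r}$ subsets, so it is only practical for small n; it is meant to mirror the definition rather than to be efficient.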

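Lemma 3.2.3

Let $X_{1:n}, \ldots, X_{n:n}$ be i.i.d. random variables whose
distribution may depend on n. Suppose $K_n(x_1, \ldots, x_r)$ satisfies

(i) $E[K_n(X_{1:n}, \ldots, X_{r:n})] = 0$ for every n, and

(ii) $\lim_{n\to\infty} n^{-1} E[(K_n(X_{1:n}, \ldots, X_{r:n}))^2] = 0$;

then

\[ U_n = \binom{n}{r}^{-1} \sum_{\alpha \in A} K_n(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}) \xrightarrow{p} 0 , \quad \text{as } n \to \infty , \]

where A is as defined earlier.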


Proof. Write

\[ E[U_n^2] = \mathrm{Var}[U_n] = \binom{n}{r}^{-1} \sum_{c=1}^{r} \binom{r}{c} \binom{n-r}{r-c} \zeta_{c,n} , \]

where

\[ \zeta_{c,n} = E[K_n(X_{1:n}, \ldots, X_{r:n}) \, K_n(X_{1:n}, \ldots, X_{c:n}, X_{r+1:n}, \ldots, X_{2r-c:n})] \]

(see, for example, Randles and Wolfe, 1979, p. 65). Also, it has been
shown that, for fixed n,

\[ \zeta_{c,n} \le \zeta_{r,n} \quad \text{for } c = 1, 2, \ldots, r , \]

with

\[ \zeta_{r,n} = E[K_n^2(X_{1:n}, \ldots, X_{r:n})] . \]

Now, define $K_c$ by

\[ K_c = \frac{(r!)^2}{c!\,[(r-c)!]^2} , \]

so that each term in the above sum involves

\[ K_c \, \frac{(n-r)(n-r-1) \cdots (n-2r+c+1)}{n(n-1) \cdots (n-r+1)} \, \zeta_{c,n}
   \le K_c \, \frac{(n-r)(n-r-1) \cdots (n-2r+c+1)}{n(n-1) \cdots (n-r+1)} \, \zeta_{r,n} . \]

Note that the numerator involves (r-c) factors of n, whereas the
denominator involves r such factors, so that for each c = 1, 2,
..., r, the coefficient of $\zeta_{r,n}$ is $O(n^{-\delta})$, $\delta \ge 1$, and therefore, from
(ii), each term in the sum goes to zero, as n goes to infinity. It
follows that, as n approaches infinity, $E[U_n^2]$ goes to zero, and,
therefore, $U_n$ converges in probability to zero.
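The factorization used in this proof can be checked with exact arithmetic. The sketch below (function names are hypothetical) compares the combinatorial coefficient of $\zeta_{c,n}$ with the factored form built from $K_c$ and the ratio of falling products, and confirms that the coefficient for c = 1 is of order 1/n.

```python
from fractions import Fraction
from math import comb, factorial


def exact_coefficient(n, r, c):
    """C(r, c) * C(n - r, r - c) / C(n, r), the weight of zeta_{c,n}."""
    return Fraction(comb(r, c) * comb(n - r, r - c), comb(n, r))


def factored_coefficient(n, r, c):
    """K_c * (n-r)(n-r-1)...(n-2r+c+1) / (n(n-1)...(n-r+1))."""
    k_c = Fraction(factorial(r) ** 2, factorial(c) * factorial(r - c) ** 2)
    numerator = 1
    for j in range(r - c):  # r - c factors, ending at n - 2r + c + 1
        numerator *= n - r - j
    denominator = 1
    for j in range(r):  # r factors, ending at n - r + 1
        denominator *= n - j
    return k_c * Fraction(numerator, denominator)


# n times the c = 1 coefficient approaches K_1 = r^2 as n grows.
scaled = float(1000 * exact_coefficient(1000, 2, 1))
print(scaled)
```

Since the leading (c = 1) coefficient behaves like $r^2/n$, the variance bound in the proof is governed by $n^{-1}\zeta_{r,n}$.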

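Condition 3.2.4 Suppose

\[ n^{1/2}(\hat\beta - \beta) = O_p(1) \quad \text{as } n \to \infty . \]

Condition 3.2.5 Suppose there is a neighborhood of $\beta$, say $K(\beta)$,
and a positive constant $K_1$ such that if $\gamma \in K(\beta)$ and $D(\gamma,d)$ is a sphere
centered at $\gamma$ with radius d satisfying $D(\gamma,d) \subset K(\beta)$, then, for every
n,

\[ E\Big[ \sup_{\gamma' \in D(\gamma,d)} |h(X_{1:n}, \ldots, X_{r:n}; \gamma') - h(X_{1:n}, \ldots, X_{r:n}; \gamma)| \Big] \le K_1 d \tag{3.2.6} \]

and

\[ \lim_{d \to 0} E\Big[ \sup_{\gamma' \in D(\gamma,d)} |h(X_{1:n}, \ldots, X_{r:n}; \gamma') - h(X_{1:n}, \ldots, X_{r:n}; \gamma)|^2 \Big] = 0 \tag{3.2.7} \]

uniformly in n. That is, for every $\varepsilon' > 0$ and every n, there exists a
positive constant $D'$ such that for $0 < d < D'$ and $D(\gamma,d) \subset K(\beta)$,

\[ E\Big[ \sup_{\gamma' \in D(\gamma,d)} |h(X_{1:n}, \ldots, X_{r:n}; \gamma') - h(X_{1:n}, \ldots, X_{r:n}; \gamma)|^2 \Big] < \varepsilon' . \]

THEOREM 3.2.8

Under Conditions 3.2.4 and 3.2.5,

\[ n^{1/2}[U_n(\hat\beta) - \theta_n(\hat\beta) - U_n(\beta) + \theta_n(\beta)] \xrightarrow{p} 0 , \quad \text{as } n \to \infty . \]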


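PROOF. Let

\[ \tilde h_n(x_1, \ldots, x_r; \gamma) = h(x_1, \ldots, x_r; \gamma) - \theta_n(\gamma) \]

so that

\[ U_n(\gamma) - \theta_n(\gamma) = \binom{n}{r}^{-1} \sum_{\alpha \in A} \tilde h_n(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \gamma) , \]

where A denotes the collection of all subsets of r integers from
{1, 2, ..., n}. Then,

\[
\begin{aligned}
& n^{1/2}[U_n(\hat\beta) - \theta_n(\hat\beta) - U_n(\beta) + \theta_n(\beta)] \\
&\quad = n^{1/2} \binom{n}{r}^{-1} \sum_{\alpha \in A}
   [\tilde h_n(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}\hat s)
    - \tilde h_n(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta)] .
\end{aligned}
\]

Denote the above expression by $Q_n(\hat s)$, where $\hat s = n^{1/2}(\hat\beta - \beta)$. Now take $\varepsilon$
and $\gamma$ to be arbitrary positive constants. By Condition 3.2.4,
$n^{1/2}(\hat\beta - \beta) = O_p(1)$, so that we can find a sphere C in $R^p$, centered at
the origin, such that

\[ P[n^{1/2}(\hat\beta - \beta) \notin C] < \gamma \quad \text{for every } n . \tag{3.2.9} \]

Then,

\[
\begin{aligned}
P[|Q_n(n^{1/2}(\hat\beta - \beta))| > \varepsilon]
 &= P[|Q_n(n^{1/2}(\hat\beta - \beta))| > \varepsilon ,\ n^{1/2}(\hat\beta - \beta) \in C] \\
 &\quad + P[|Q_n(n^{1/2}(\hat\beta - \beta))| > \varepsilon ,\ n^{1/2}(\hat\beta - \beta) \notin C] \\
 &\le P[|Q_n(n^{1/2}(\hat\beta - \beta))| > \varepsilon ,\ n^{1/2}(\hat\beta - \beta) \in C] + P[n^{1/2}(\hat\beta - \beta) \notin C] \\
 &\le P\Big[\sup_{s \in C} |Q_n(s)| > \varepsilon\Big] + P[n^{1/2}(\hat\beta - \beta) \notin C] .
\end{aligned}
\]

It suffices to show that

\[ P\Big[\sup_{s \in C} |Q_n(s)| > \varepsilon\Big] \to 0 \quad \text{as } n \to \infty , \]

where $\varepsilon$ and C are fixed.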

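Let $C_u$, u = 1, 2, ..., U, denote a finite collection of open
spheres centered at $s_u$ with radii $\|C_u\| \le \varepsilon/(8K_1)$ for every
u = 1, 2, ..., U, such that $\bigcup_u C_u \supset C$. Now, for each u,

\[
\begin{aligned}
Q_n(s) &= n^{1/2} \binom{n}{r}^{-1} \sum_{\alpha \in A}
   [\tilde h_n(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s)
    - \tilde h_n(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s_u)] \\
 &\quad + n^{1/2} \binom{n}{r}^{-1} \sum_{\alpha \in A}
   [\tilde h_n(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s_u)
    - \tilde h_n(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta)] \\
 &= Q_{n,u}(s) + Q_{n,0}(s_u) .
\end{aligned}
\]

Also note that

\[ P\Big[\sup_{s \in C} |Q_n(s)| > \varepsilon\Big] \le \sum_{u=1}^{U} P\Big[\sup_{s \in C_u} |Q_n(s)| > \varepsilon\Big] , \]

since $\{\sup_{s \in C} |Q_n(s)| > \varepsilon\}$ implies $\{\sup_{s \in C_u} |Q_n(s)| > \varepsilon\}$ for
some u = 1, 2, ..., U. It suffices then to show that each term in
the above finite sum converges to zero, as $n \to \infty$. But,

\[
\begin{aligned}
P\Big[\sup_{s \in C_u} |Q_n(s)| > \varepsilon\Big]
 &= P\Big[\sup_{s \in C_u} |Q_{n,u}(s) + Q_{n,0}(s_u)| > \varepsilon\Big] \\
 &\le P\Big[\sup_{s \in C_u} |Q_{n,u}(s)| > \tfrac{\varepsilon}{2}\Big] + P\Big[|Q_{n,0}(s_u)| > \tfrac{\varepsilon}{2}\Big] .
\end{aligned}
\]

We shall next apply Lemma 3.2.3 to show that each of the probabilities
on the right hand side of the above inequality converges to zero.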

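First consider

\[ Q_{n,0}(s_u) = \binom{n}{r}^{-1} \sum_{\alpha \in A} n^{1/2}
   [\tilde h_n(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s_u)
    - \tilde h_n(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta)] . \]

Applying Lemma 3.2.3 with

\[ K_n(\cdot) = n^{1/2} [\tilde h_n(X_{1:n}, \ldots, X_{r:n}; \beta + n^{-1/2}s_u) - \tilde h_n(X_{1:n}, \ldots, X_{r:n}; \beta)] \]

shows $Q_{n,0}(s_u) \xrightarrow{p} 0$, provided we can show that

\[ \lim_{n \to \infty} E[\{\tilde h_n(X_{1:n}, \ldots, X_{r:n}; \beta + n^{-1/2}s_u) - \tilde h_n(X_{1:n}, \ldots, X_{r:n}; \beta)\}^2] = 0 . \]

But

\[
\begin{aligned}
& \{\tilde h_n(X_{1:n}, \ldots, X_{r:n}; \beta + n^{-1/2}s_u) - \tilde h_n(X_{1:n}, \ldots, X_{r:n}; \beta)\}^2 \\
&\quad = \{h(X_{1:n}, \ldots, X_{r:n}; \beta + n^{-1/2}s_u) - h(X_{1:n}, \ldots, X_{r:n}; \beta)
      - [\theta_n(\beta + n^{-1/2}s_u) - \theta_n(\beta)]\}^2 \\
&\quad \le 2[h(X_{1:n}, \ldots, X_{r:n}; \beta + n^{-1/2}s_u) - h(X_{1:n}, \ldots, X_{r:n}; \beta)]^2
      + 2[\theta_n(\beta + n^{-1/2}s_u) - \theta_n(\beta)]^2 .
\end{aligned}
\]

Taking expectations, we have

\[
\begin{aligned}
& E[\{\tilde h_n(X_{1:n}, \ldots, X_{r:n}; \beta + n^{-1/2}s_u) - \tilde h_n(X_{1:n}, \ldots, X_{r:n}; \beta)\}^2] \\
&\quad \le 2E\{[h(X_{1:n}, \ldots, X_{r:n}; \beta + n^{-1/2}s_u) - h(X_{1:n}, \ldots, X_{r:n}; \beta)]^2\}
      + 2[\theta_n(\beta + n^{-1/2}s_u) - \theta_n(\beta)]^2 \\
&\quad \le 4E[\,|h(X_{1:n}, \ldots, X_{r:n}; \beta + n^{-1/2}s_u) - h(X_{1:n}, \ldots, X_{r:n}; \beta)|^2\,] \\
&\quad \le 4E\Big[\sup_{s \in \cup_u C_u} |h(X_{1:n}, \ldots, X_{r:n}; \beta + n^{-1/2}s) - h(X_{1:n}, \ldots, X_{r:n}; \beta)|^2\Big] ,
\end{aligned}
\]

which goes to zero, as n goes to infinity, by (3.2.7) of Condition
3.2.5. Here we use the fact that

\[ \|\beta + n^{-1/2}s - \beta\| = \|n^{-1/2}s\| \le 2n^{-1/2}\,U\,\|C_u\| \le \frac{2\varepsilon U}{8K_1 n^{1/2}} . \]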


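Next we examine

\[
\begin{aligned}
\sup_{s \in C_u} |Q_{n,u}(s)|
 &= \sup_{s \in C_u} \Big| n^{1/2} \binom{n}{r}^{-1} \sum_{\alpha \in A}
    [\tilde h_n(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s)
     - \tilde h_n(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s_u)] \Big| \\
 &\le \binom{n}{r}^{-1} \sum_{\alpha \in A} \sup_{s \in C_u} n^{1/2}
    |\tilde h_n(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s)
     - \tilde h_n(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s_u)| \\
 &= \binom{n}{r}^{-1} \sum_{\alpha \in A} n^{1/2} \Big\{
      \sup_{s \in C_u} |\tilde h_n(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s)
         - \tilde h_n(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s_u)| \\
 &\qquad\qquad - E\Big[\sup_{s \in C_u} |\tilde h_n(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s)
         - \tilde h_n(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s_u)|\Big] \Big\} \\
 &\quad + n^{1/2} E\Big\{\sup_{s \in C_u} |\tilde h_n(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s)
         - \tilde h_n(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s_u)|\Big\} \\
 &= D_{1n} + D_{2n} .
\end{aligned}
\]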

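Now,

\[
\begin{aligned}
D_{2n}
 &= n^{1/2} E\Big[\sup_{s \in C_u} |\tilde h_n(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s)
      - \tilde h_n(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s_u)|\Big] \\
 &= n^{1/2} E\Big[\sup_{s \in C_u} |h(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s)
      - h(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s_u) \\
 &\qquad\qquad\qquad - \theta_n(\beta + n^{-1/2}s) + \theta_n(\beta + n^{-1/2}s_u)|\Big] \\
 &\le n^{1/2} E\Big[\sup_{s \in C_u} |h(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s)
      - h(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s_u)|\Big] \\
 &\quad + n^{1/2} \sup_{s \in C_u} |\theta_n(\beta + n^{-1/2}s) - \theta_n(\beta + n^{-1/2}s_u)| \\
 &= n^{1/2} E\Big[\sup_{s \in C_u} |h(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s)
      - h(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s_u)|\Big] \\
 &\quad + n^{1/2} \sup_{s \in C_u} |E[h(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s)
      - h(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s_u)]| \\
 &\le 2n^{1/2} E\Big[\sup_{s \in C_u} |h(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s)
      - h(X_{\alpha_1:n}, \ldots, X_{\alpha_r:n}; \beta + n^{-1/2}s_u)|\Big] \\
 &\le 2n^{1/2} K_1 n^{-1/2} \|C_u\| = 2K_1 \|C_u\| \le \frac{\varepsilon}{4}
\end{aligned}
\]

by (3.2.6) of Condition 3.2.5, the definition of $C_u$, and

\[ \|\beta + n^{-1/2}s - \beta - n^{-1/2}s_u\| = n^{-1/2}\|s - s_u\| \le n^{-1/2}\|C_u\| . \]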


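Next consider $D_{1n}$ and apply Lemma 3.2.3 with

\[
\begin{aligned}
K_n(\cdot) = n^{1/2} \Big\{ &\sup_{s \in C_u} |\tilde h_n(X_{1:n}, \ldots, X_{r:n}; \beta + n^{-1/2}s)
      - \tilde h_n(X_{1:n}, \ldots, X_{r:n}; \beta + n^{-1/2}s_u)| \\
 &- E\Big[\sup_{s \in C_u} |\tilde h_n(X_{1:n}, \ldots, X_{r:n}; \beta + n^{-1/2}s)
      - \tilde h_n(X_{1:n}, \ldots, X_{r:n}; \beta + n^{-1/2}s_u)|\Big] \Big\} .
\end{aligned}
\]

Now,

\[
\begin{aligned}
\frac{1}{n} E[K_n^2(X_{1:n}, \ldots, X_{r:n})]
 &= E\Big\{\sup_{s \in C_u} |\tilde h_n(X_{1:n}, \ldots, X_{r:n}; \beta + n^{-1/2}s)
      - \tilde h_n(X_{1:n}, \ldots, X_{r:n}; \beta + n^{-1/2}s_u)| \\
 &\qquad - E\Big[\sup_{s \in C_u} |\tilde h_n(X_{1:n}, \ldots, X_{r:n}; \beta + n^{-1/2}s)
      - \tilde h_n(X_{1:n}, \ldots, X_{r:n}; \beta + n^{-1/2}s_u)|\Big]\Big\}^2 \\
 &\le E\Big[\sup_{s \in C_u} |\tilde h_n(X_{1:n}, \ldots, X_{r:n}; \beta + n^{-1/2}s)
      - \tilde h_n(X_{1:n}, \ldots, X_{r:n}; \beta + n^{-1/2}s_u)|^2\Big] \\
 &\le 2E\Big[\sup_{s \in C_u} |h(X_{1:n}, \ldots, X_{r:n}; \beta + n^{-1/2}s)
      - h(X_{1:n}, \ldots, X_{r:n}; \beta + n^{-1/2}s_u)|^2\Big] \\
 &\quad + 2 \sup_{s \in C_u} |\theta_n(\beta + n^{-1/2}s) - \theta_n(\beta + n^{-1/2}s_u)|^2 \\
 &\le 4E\Big[\sup_{s \in C_u} |h(X_{1:n}, \ldots, X_{r:n}; \beta + n^{-1/2}s)
      - h(X_{1:n}, \ldots, X_{r:n}; \beta + n^{-1/2}s_u)|^2\Big] \\
 &\to 0 , \quad \text{as } n \to \infty ,
\end{aligned}
\]

by (3.2.7) of Condition 3.2.5 and since

\[ n^{-1/2}\|C_u\| \to 0 , \quad \text{as } n \to \infty . \]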


Thus far, we have shown that under Conditions 3.2.4 and 3.2.5

\[ n^{1/2}[U_n(\hat\beta) - \theta_n(\hat\beta) - U_n(\beta) + \theta_n(\beta)] \xrightarrow{p} 0 , \quad \text{as } n \to \infty . \]

The main result of this section is given in the next theorem
which yields the limiting distribution of

\[ n^{1/2}[U_n(\hat\beta) - \theta_n(\beta)] . \]


THEOREM 3.2.10

Suppose that $\theta_n(\gamma)$ is uniformly (in n) differentiable at $\gamma = \beta$
and that this differential is zero. Suppose further that the
conditions of Theorem 3.2.8 are satisfied. If, in addition,

\[ n^{1/2}[U_n(\beta) - \theta_n(\beta)] \xrightarrow{d} N(0, \tau^2) , \quad \text{as } n \to \infty , \tag{3.2.11} \]

with $\tau^2 > 0$, then

\[ n^{1/2}[U_n(\hat\beta) - \theta_n(\beta)] \xrightarrow{d} N(0, \tau^2) . \]


PROOF. Note that

\[
\begin{aligned}
n^{1/2}[U_n(\hat\beta) - \theta_n(\beta)]
 &= n^{1/2}[U_n(\hat\beta) - \theta_n(\hat\beta) - U_n(\beta) + \theta_n(\beta)] \\
 &\quad + n^{1/2}[U_n(\beta) - \theta_n(\beta) - \theta_n(\beta) + \theta_n(\hat\beta)] \\
 &= n^{1/2}[U_n(\beta) - \theta_n(\beta)] + n^{1/2}[\theta_n(\hat\beta) - \theta_n(\beta)] + o_p(1) ,
\end{aligned}
\]

since by Theorem 3.2.8

\[ n^{1/2}[U_n(\hat\beta) - \theta_n(\hat\beta) - U_n(\beta) + \theta_n(\beta)] \xrightarrow{p} 0 , \quad \text{as } n \to \infty . \]

Then, by Slutsky's theorem,

\[ n^{1/2}[U_n(\hat\beta) - \theta_n(\beta)] \quad \text{and} \quad n^{1/2}[U_n(\beta) - \theta_n(\beta)] \]

have the same limiting distribution, provided that we can show that

\[ n^{1/2}[\theta_n(\hat\beta) - \theta_n(\beta)] = o_p(1) . \tag{3.2.12} \]

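We show below that this follows from the fact that $\theta_n(\gamma)$ is uniformly
differentiable at $\gamma = \beta$, and that this differential is zero at
$\gamma = \beta$. By definition (see, for example, Serfling, 1980, p. 45), $\theta_n(\gamma)$ is
uniformly (in n) differentiable at $\gamma = \beta$ if for every n, $\partial\theta_n/\partial\gamma_i$,
i = 1, 2, ..., p, all exist and if, in addition, the differential
function

\[ \sum_{i=1}^{p} \frac{\partial\theta_n}{\partial\gamma_i}\Big|_{\gamma = \beta} (\gamma_i - \beta_i) \]

satisfies the property that for every $\varepsilon > 0$, there exists a neighborhood
$N_\varepsilon(\beta)$ of $\beta$ and an $N^*_\varepsilon$ such that for every $\gamma \in N_\varepsilon(\beta)$ and for $n > N^*_\varepsilon$,

\[ \Big| \theta_n(\gamma) - \theta_n(\beta) - \sum_{i=1}^{p} (\gamma_i - \beta_i) \frac{\partial\theta_n}{\partial\gamma_i}\Big|_{\gamma = \beta} \Big| \le \varepsilon \|\gamma - \beta\| . \]

Now since $\theta_n$ admits a zero differential at $\gamma = \beta$, and since it is
uniformly (in n) differentiable at $\gamma = \beta$, we have that for every $\varepsilon > 0$
there exists $N_\varepsilon(\beta)$, a neighborhood of $\beta$, and $N^*_\varepsilon$ such that

\[ |\theta_n(\gamma) - \theta_n(\beta)| \le \varepsilon \|\gamma - \beta\| , \]

whenever $\gamma \in N_\varepsilon(\beta)$ and $n > N^*_\varepsilon$.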


It follows that for $\hat\beta$ in $N_\varepsilon(\beta)$, and $n > N^*_\varepsilon$,

\[ n^{1/2} |\theta_n(\hat\beta) - \theta_n(\beta)| \le \varepsilon\, n^{1/2} \|\hat\beta - \beta\| . \tag{3.2.13} \]

Also, by Condition 3.2.4,

\[ n^{1/2}(\hat\beta - \beta) = O_p(1) , \]

which implies that

\[ n^{1/2} \|\hat\beta - \beta\| = O_p(1) \tag{3.2.14} \]

since

\[ n^{1/2} \|\hat\beta - \beta\| = n^{1/2} \Big[ \sum_{i=1}^{p} (\hat\beta_i - \beta_i)^2 \Big]^{1/2}
   \le \sum_{i=1}^{p} |n^{1/2}(\hat\beta_i - \beta_i)| , \]

and since a finite sum of $O_p(1)$ variables is $O_p(1)$. By (3.2.14), we
know that for every $\delta > 0$ there exists $M_\delta > 0$ such that

\[ P\{n^{1/2} \|\hat\beta - \beta\| > M_\delta\} < \delta , \tag{3.2.15} \]

for every n. Now, to show (3.2.12) we need to show that for every
$\varepsilon^* > 0$ and every $\delta^* > 0$, there exists an N such that

\[ P\{n^{1/2} |\theta_n(\hat\beta) - \theta_n(\beta)| > \varepsilon^*\} < \delta^* , \]

whenever n > N.

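Take $\delta = \delta^*/2$ and let $\varepsilon = \varepsilon^*/M_\delta$, where $M_\delta$ is defined by (3.2.15).
By (3.2.13) we know that there exists a neighborhood of $\beta$ with radius
$d_\varepsilon$, and there exists an $N^*_\varepsilon$ such that $n > N^*_\varepsilon$ and $\|\hat\beta - \beta\| < d_\varepsilon$ imply

\[ n^{1/2} |\theta_n(\hat\beta) - \theta_n(\beta)| \le \frac{\varepsilon^*}{M_\delta}\, n^{1/2} \|\hat\beta - \beta\| . \]

Choose $N_1$ so that $n > N_1$ implies

\[ P\{\|\hat\beta - \beta\| > d_\varepsilon\} < \frac{\delta^*}{2} . \]

(Note that the choice of such an $N_1$ is possible since
$n^{1/2} \|\hat\beta - \beta\| = O_p(1)$.) Combining the above observations we see that
for every $\varepsilon^* > 0$, every $\delta^* > 0$ and for $n > N = \max(N^*_\varepsilon, N_1)$,

\[
\begin{aligned}
P\{n^{1/2} |\theta_n(\hat\beta) - \theta_n(\beta)| > \varepsilon^*\}
 &= P\{n^{1/2} |\theta_n(\hat\beta) - \theta_n(\beta)| > \varepsilon^* ,\ \|\hat\beta - \beta\| > d_\varepsilon\} \\
 &\quad + P\{n^{1/2} |\theta_n(\hat\beta) - \theta_n(\beta)| > \varepsilon^* ,\ \|\hat\beta - \beta\| \le d_\varepsilon\} \\
 &\le \frac{\delta^*}{2} + P\{n^{1/2} |\theta_n(\hat\beta) - \theta_n(\beta)| > \varepsilon^* ,\ \|\hat\beta - \beta\| \le d_\varepsilon\} \\
 &\le \frac{\delta^*}{2} + P\Big\{\frac{\varepsilon^*}{M_\delta}\, n^{1/2} \|\hat\beta - \beta\| > \varepsilon^*\Big\} \\
 &= \frac{\delta^*}{2} + P\{n^{1/2} \|\hat\beta - \beta\| > M_\delta\} \\
 &< \frac{\delta^*}{2} + \frac{\delta^*}{2} = \delta^* ,
\end{aligned}
\]

which implies that $n^{1/2}[\theta_n(\hat\beta) - \theta_n(\beta)] = o_p(1)$.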


REMARK  3.2.16 

Although the above results deal with a random sample of
observations from a univariate distribution, they remain valid when
$X_{1:n}, \ldots, X_{n:n}$ is a random sample from a multivariate population.

REMARK  3.2.17 

A difficult step in applying Theorem 3.2.10 is verifying (3.2.7)
of Condition 3.2.5. However, if one can show that there exists an
$M_1 > 0$ such that

\[ |h(X_{1:n}, \ldots, X_{r:n}; \gamma') - h(X_{1:n}, \ldots, X_{r:n}; \gamma)| \le M_1 \tag{3.2.18} \]

for all $\gamma$ and $\gamma'$ in some neighborhood of $\beta$, and every $X_{1:n}, \ldots, X_{r:n}$,
then (3.2.6) implies (3.2.7). To see this note that

\[
\begin{aligned}
& E\Big[\Big\{\sup_{\gamma' \in D(\gamma,d)} |h(X_{1:n}, \ldots, X_{r:n}; \gamma') - h(X_{1:n}, \ldots, X_{r:n}; \gamma)|\Big\}^2\Big] \\
&\quad \le M_1 E\Big[\sup_{\gamma' \in D(\gamma,d)} |h(X_{1:n}, \ldots, X_{r:n}; \gamma') - h(X_{1:n}, \ldots, X_{r:n}; \gamma)|\Big] \\
&\quad \le M_1 K_1 d ,
\end{aligned}
\]

which goes to zero as $d \to 0$ by (3.2.6).


3.3 The Asymptotic Normality of $T_n$ Under a
Sequence of Alternatives

The statistic $T_n$ involves the estimator $\hat\beta = (\hat\beta_1, \hat\beta_2)'$ of
$\beta = (\beta_1, \beta_2)'$, so that to apply Theorem 3.2.10 we need first to obtain
the asymptotic normality of $n^{1/2}[T_n(\beta) - \theta_n(\beta)]$ under a sequence of
alternatives, i.e., we need the asymptotic normality of the statistic
involving the parameter value $\beta$ rather than its estimate $\hat\beta$. To this
end, we shall apply theorem 5.3.10, and lemmas 5.3.11 and 5.3.13 of
Randles and Wolfe (1979).

Using the "trivariate reduction" method, we may write

\[ E_i = W_{1i} + \Delta W_{3i} \]

and

\[ E'_i = W_{2i} + \Delta W_{3i} , \qquad i = 1, 2, \ldots, n , \]

so that our underlying linear models are given by

\[ Y_i = \alpha_1 + \beta_1 X_i + W_{1i} + \Delta W_{3i} , \quad \text{and} \]

\[ Z_i = \alpha_2 + \beta_2 X_i + W_{2i} + \Delta W_{3i} , \qquad i = 1, 2, \ldots, n , \]

and the hypothesis of independence of $E_i$ and $E'_i$, i = 1, 2, ..., n,
is equivalent to the hypothesis $\Delta = 0$. To establish the results of
this section, we need the following assumptions:

3.3.1 $\{W_{1i}\}$, $\{W_{2i}\}$ and $\{W_{3i}\}$, i = 1, 2, ..., n, are three
independent random samples of random variables with absolutely
continuous distribution functions $G_1(.)$, $G_2(.)$ and $G_3(.)$,
respectively.

3.3.2 The variables $T_k = W_{k1} - W_{k2}$ have distribution functions
$F_k(.)$ and bounded and continuous density functions $f_k(.)$, k = 1, 2, 3.

3.3.3 The variable X has a finite first moment.

Let $S_{1:n}, \ldots, S_{n:n}$ be a random sample from some
trivariate distribution with distribution function $G_n(.,.,.)$ depending
on n, where

\[ S_{i:n} = \begin{pmatrix} X_i \\ W_{1i} + \Delta_n W_{3i} \\ W_{2i} + \Delta_n W_{3i} \end{pmatrix} . \tag{3.3.4} \]


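The symmetric kernel of degree r = 2 is then given by

\[
\begin{aligned}
h(S_{1:n}, S_{2:n}; \gamma) = \mathrm{sgn}\{ &[(W_{11} - W_{12}) + \Delta_n (W_{31} - W_{32}) - (\gamma_1 - \beta_1)(X_1 - X_2)] \\
 &\times [(W_{21} - W_{22}) + \Delta_n (W_{31} - W_{32}) - (\gamma_2 - \beta_2)(X_1 - X_2)] \} ,
\end{aligned}
\]

where $\Delta_n$ is a sequence of parameters depending on n with $\Delta_n \to 0$, as
$n \to \infty$, $\beta = (\beta_1, \beta_2)'$ is a fixed parameter and $\gamma = (\gamma_1, \gamma_2)'$ is a
mathematical variable.

The kernel may be rewritten as

\[ h(S_{1:n}, S_{2:n}; \gamma) = \mathrm{sgn}\{[T_1 + \Delta_n T_3 - (\gamma_1 - \beta_1)(X_1 - X_2)][T_2 + \Delta_n T_3 - (\gamma_2 - \beta_2)(X_1 - X_2)]\} , \]

and the corresponding U-statistic is

\[ T_n(\gamma) = T_n(S_{1:n}, \ldots, S_{n:n}; \gamma) = \binom{n}{2}^{-1} \sum_{i<j} h(S_{i:n}, S_{j:n}; \gamma) . \]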
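In terms of the observed data, the kernel amounts to the sign of the product of pairwise differences of the residuals $Y_i - \gamma_1 X_i$ and $Z_i - \gamma_2 X_i$ (the intercepts cancel in the differences). A minimal sketch of the resulting statistic follows; the function names and the toy data are assumptions made for illustration only.

```python
def sgn(v):
    """Sign function returning -1, 0 or 1."""
    return (v > 0) - (v < 0)


def t_n(x, y, z, gamma1, gamma2):
    """Kendall-type U-statistic on residual differences for candidate slopes."""
    n = len(x)
    total = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            ry = (y[i] - y[j]) - gamma1 * dx  # difference of Y-residuals
            rz = (z[i] - z[j]) - gamma2 * dx  # difference of Z-residuals
            total += sgn(ry * rz)
    return total / (n * (n - 1) / 2)


# Toy data: Y = 2X + E and Z = -X + E', where E' has the same ordering as E.
x = [0.0, 1.0, 2.0, 3.0]
e = [0.1, -0.2, 0.3, 0.0]
e_prime = [0.2, -0.1, 0.4, 0.1]
y = [2.0 * xi + ei for xi, ei in zip(x, e)]
z = [-1.0 * xi + ei for xi, ei in zip(x, e_prime)]

at_true_slopes = t_n(x, y, z, 2.0, -1.0)  # uses the error differences only
at_zero_slopes = t_n(x, y, z, 0.0, 0.0)   # dominated by the opposite trends in X
print(at_true_slopes, at_zero_slopes)
```

The contrast between the two calls shows why the slopes must be removed before measuring the dependence between the errors: with the slopes ignored, the opposite trends in X dominate the sign pattern.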


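To obtain the asymptotic distribution of $T_n(\beta)$, we first need to find
its mean and its limiting variance, which we shall do next. Note that

\[
\begin{aligned}
\theta_n(\beta) &= E[T_n(\beta)] \\
 &= E[\mathrm{sgn}\{[T_1 + \Delta_n T_3][T_2 + \Delta_n T_3]\}] \\
 &= P\{[T_1 + \Delta_n T_3][T_2 + \Delta_n T_3] > 0\} - P\{[T_1 + \Delta_n T_3][T_2 + \Delta_n T_3] < 0\} \\
 &= P\{T_1 + \Delta_n T_3 > 0 ,\ T_2 + \Delta_n T_3 > 0\} + P\{T_1 + \Delta_n T_3 < 0 ,\ T_2 + \Delta_n T_3 < 0\} \\
 &\quad - P\{T_1 + \Delta_n T_3 > 0 ,\ T_2 + \Delta_n T_3 < 0\} - P\{T_1 + \Delta_n T_3 < 0 ,\ T_2 + \Delta_n T_3 > 0\} \\
 &= 2P\{T_1 + \Delta_n T_3 > 0 ,\ T_2 + \Delta_n T_3 > 0\} + 2P\{T_1 + \Delta_n T_3 < 0 ,\ T_2 + \Delta_n T_3 < 0\} - 1 \\
 &= E_{T_3}\{2[1 - F_1(-\Delta_n T_3)][1 - F_2(-\Delta_n T_3)] + 2F_1(-\Delta_n T_3)F_2(-\Delta_n T_3) - 1\} ,
\end{aligned}
\tag{3.3.5}
\]

where $E_{T_3}$ denotes expectation with respect to the random variable $T_3$.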
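As a quick consistency check on (3.3.5), consider the null case $\Delta_n = 0$. Each $T_k = W_{k1} - W_{k2}$ is a difference of two i.i.d. continuous variables, hence symmetric about zero with $F_k(0) = 1/2$, so one would expect the mean to vanish:

```latex
\theta_n(\beta)\Big|_{\Delta_n = 0}
  = 2\,[1 - F_1(0)]\,[1 - F_2(0)] + 2\,F_1(0)\,F_2(0) - 1
  = 2\cdot\tfrac{1}{2}\cdot\tfrac{1}{2} + 2\cdot\tfrac{1}{2}\cdot\tfrac{1}{2} - 1
  = 0 .
```

This agrees with the fact that Kendall's tau between independent continuous variables is zero.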

2 

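The asymptotic variance of $T_n(\beta)$ is $\eta = \lim_{n\to\infty} r^2 \zeta_{1,n}$, with

\[ \zeta_{1,n} = \mathrm{Var}[h_{1:n}(S_{1:n})] \]

and

\[ h_{1:n}(s) = E[h(S_{1:n}, S_{2:n}; \beta) \mid S_{1:n} = s] . \]

The limiting variance, $\eta$, will be obtained as a result of applying
theorem 5.3.10 of Randles and Wolfe (1979), for which we only need
the quantity $h_{1:n}(s)$. We see that

\[
\begin{aligned}
h_{1:n}(s) &= E[h(s, S_{2:n}; \beta) \mid S_{1:n} = s] \\
 &= E[\mathrm{sgn}\{[(W_{11} - W_{12}) + \Delta_n (W_{31} - W_{32})][(W_{21} - W_{22}) + \Delta_n (W_{31} - W_{32})]\} \mid S_{1:n} = s] \\
 &= E[\mathrm{sgn}\{[e_n - W_{12} - \Delta_n W_{32}][e'_n - W_{22} - \Delta_n W_{32}]\}] \\
 &= 2P\{e_n - W_{12} - \Delta_n W_{32} > 0 ,\ e'_n - W_{22} - \Delta_n W_{32} > 0\} \\
 &\quad + 2P\{e_n - W_{12} - \Delta_n W_{32} < 0 ,\ e'_n - W_{22} - \Delta_n W_{32} < 0\} - 1 \\
 &= E_{W_{32}}\{2G_1(e_n - \Delta_n W_{32}) G_2(e'_n - \Delta_n W_{32}) \\
 &\qquad\quad + 2[1 - G_1(e_n - \Delta_n W_{32})][1 - G_2(e'_n - \Delta_n W_{32})] - 1\} ,
\end{aligned}
\tag{3.3.6}
\]

where $G_1(.)$ [$G_2(.)$] is the distribution function of $W_{12}$ [$W_{22}$],
$e_n = w_1 + \Delta_n w_3$ ($e'_n = w_2 + \Delta_n w_3$), and $w_1$, $w_2$ and $w_3$ are
given values of $W_{11}$, $W_{21}$ and $W_{31}$.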

Next, we verify the conditions of theorem 5.3.10 of Randles and
Wolfe. Condition (i) is immediate, since

\[ E[h^2(S_{1:n}, S_{2:n})] = 1 , \quad \text{for every } n \ge 2 . \tag{3.3.7} \]

Conditions (ii) and (iii) hold, if the conditions of lemmas 5.3.11 and
5.3.13 are satisfied. Lemma 5.3.11 follows from (3.3.7) with M = 1.
There remains to verify conditions (i) - (iv) of lemma 5.3.13.

Condition (i): We need to show that there exists a real valued
function k(s) such that

\[ \lim_{n \to \infty} h_{1:n}(s) = k(s) \quad \text{for every } s . \]

But from (3.3.6), and for every $s = (x, e_n, e'_n)'$,

\[
\begin{aligned}
\lim_{n \to \infty} h_{1:n}(s)
 &= \lim_{n \to \infty} E_{W_{32}}\{2G_1(e_n - \Delta_n W_{32}) G_2(e'_n - \Delta_n W_{32}) \\
 &\qquad\qquad\quad + 2[1 - G_1(e_n - \Delta_n W_{32})][1 - G_2(e'_n - \Delta_n W_{32})] - 1\} \\
 &= 2G_1(w_1)G_2(w_2) + 2[1 - G_1(w_1)][1 - G_2(w_2)] - 1 \\
 &= k(s) ,
\end{aligned}
\]

because by the Lebesgue Dominated Convergence Theorem the limit may be
passed inside the expected value, since $G_1(.)$ and $G_2(.)$ are absolutely
continuous distribution functions, and

\[ \lim_{n \to \infty} e_n = w_1 \quad \text{and} \quad \lim_{n \to \infty} e'_n = w_2 . \]

Condition (ii): Let $G_n(s)$ denote the distribution function of $S_{i:n}$.
We will show that there exists a distribution function G(s) such that

\[ \lim_{n \to \infty} G_n(s) = G(s) \quad \text{for every } s , \]

but this is immediate since $\Delta_n \to 0$, as $n \to \infty$, and, therefore,

\[ S_{i:n} = \begin{pmatrix} X_i \\ W_{1i} + \Delta_n W_{3i} \\ W_{2i} + \Delta_n W_{3i} \end{pmatrix} \]

converges in probability and in law to

\[ S_i = \begin{pmatrix} X_i \\ W_{1i} \\ W_{2i} \end{pmatrix} . \]

Here G(s) is the distribution of $S_i$, i = 1, 2, ..., n.


Condition (iii): We need to show there exists an $M^*$ such that
$|h_{1:n}(s)| \le M^*$ for every s, and every $n \ge 2$. But from the definition
of the kernel h(.,.), for every s and every $n \ge 2$,

\[ |E[h(s, S_{2:n}; \beta) \mid S_{1:n} = s]| \le 1 , \]

and Condition (iii) holds with any $M^* \ge 1$.

Condition (iv): To find $E[k^2(S)]$, where S is a random variable with
distribution function G(s), recall that

\[ k(s) = 2G_1(w_1)G_2(w_2) + 2[1 - G_1(w_1)][1 - G_2(w_2)] - 1 , \]

so that

\[
\begin{aligned}
k^2(s) &= 4G_1^2(w_1)G_2^2(w_2) + 4[1 - G_1(w_1)]^2[1 - G_2(w_2)]^2 + 1 \\
 &\quad + 8G_1(w_1)G_2(w_2)[1 - G_1(w_1)][1 - G_2(w_2)] - 4G_1(w_1)G_2(w_2) \\
 &\quad - 4[1 - G_1(w_1)][1 - G_2(w_2)] .
\end{aligned}
\]

Also note that

\[
\begin{aligned}
E[k^2(S)] &= E[k^2(W_{1i}, W_{2i})] \\
 &= \int\!\!\int k^2(w_1, w_2)\, dG(w_1, w_2) \\
 &= \int\!\!\int k^2(w_1, w_2)\, dG_1(w_1)\, dG_2(w_2) ,
\end{aligned}
\]

since $W_{1i}$, $W_{2i}$ and X are independent. Further,

\[ \int G_1^2(w_1)\,dG_1(w_1) = \int G_2^2(w_2)\,dG_2(w_2) = \tfrac{1}{3} , \]

\[ \int [1 - G_1(w_1)]^2\,dG_1(w_1) = \int [1 - G_2(w_2)]^2\,dG_2(w_2) = \tfrac{1}{3} , \]

\[ \int G_1(w_1)[1 - G_1(w_1)]\,dG_1(w_1) = \int G_2(w_2)[1 - G_2(w_2)]\,dG_2(w_2) = \tfrac{1}{6} , \]

and

\[ \int G_1(w_1)\,dG_1(w_1) = \int G_2(w_2)\,dG_2(w_2) = \tfrac{1}{2} , \]

and therefore

\[ E[k^2(S)] = \tfrac{4}{9} + \tfrac{4}{9} + 1 + \tfrac{8}{36} - 1 - 1 = \tfrac{1}{9} < \infty . \]

Thus, the conditions of lemma 5.3.13 are satisfied, and the limiting
variance of $T_n(\beta)$ is

\[ \eta = r^2\,\mathrm{Var}[k(S)] = 4E[k^2(S)] = \tfrac{4}{9} , \]

since

\[ E[k(S)] = \int\!\!\int [2G_1(w_1)G_2(w_2) + 2[1 - G_1(w_1)][1 - G_2(w_2)] - 1]\, dG_1(w_1)\, dG_2(w_2) = 0 . \]
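The moment calculation for k(S) can be reproduced with exact rational arithmetic. Because $G_1$ and $G_2$ are continuous, $G_1(W_{1i})$ and $G_2(W_{2i})$ are independent uniform(0,1) variables, so every integral involved is a Beta integral. The sketch below (helper names are hypothetical) evaluates E[k(S)], E[k^2(S)] and the resulting limiting variance.

```python
from fractions import Fraction
from math import factorial


def beta_moment(a, b):
    """Integral of u^a (1-u)^b over [0, 1], equal to a! b! / (a + b + 1)!."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))


# k = 2uv + 2(1-u)(1-v) - 1 with u, v independent uniform(0,1).
e_k = 2 * beta_moment(1, 0) ** 2 + 2 * beta_moment(0, 1) ** 2 - 1

# Expansion of k^2, integrated term by term (u and v integrals factorize).
e_k2 = (
    4 * beta_moment(2, 0) ** 2      # 4 u^2 v^2
    + 4 * beta_moment(0, 2) ** 2    # 4 (1-u)^2 (1-v)^2
    + 1                             # constant term
    + 8 * beta_moment(1, 1) ** 2    # 8 u v (1-u)(1-v)
    - 4 * beta_moment(1, 0) ** 2    # -4 u v
    - 4 * beta_moment(0, 1) ** 2    # -4 (1-u)(1-v)
)

r = 2
eta = r ** 2 * (e_k2 - e_k ** 2)  # limiting variance r^2 Var[k(S)]
print(e_k, e_k2, eta)
```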

Thus we have verified all the conditions of Theorem 5.3.10 in Randles
and Wolfe (1979). We have thus proved the following.

THEOREM 3.3.8

Under conditions 3.3.1 - 3.3.3,

\[ n^{1/2}[T_n(\beta) - \theta_n(\beta)] \xrightarrow{d} N(0, \tfrac{4}{9}) , \]

where $\theta_n(\beta)$ is given in (3.3.5).
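In the special case $\Delta_n = 0$ with the true slopes, $T_n(\beta)$ is the ordinary Kendall's tau of n independent continuous pairs, whose exact finite-sample variance is the classical $2(2n+5)/(9n(n-1))$. The sketch below (the function name is hypothetical) checks numerically that n times this variance approaches 4/9, the limiting variance obtained above.

```python
from fractions import Fraction


def kendall_tau_null_variance(n):
    """Exact variance of Kendall's tau for n independent continuous pairs."""
    return Fraction(2 * (2 * n + 5), 9 * n * (n - 1))


for n in (10, 100, 1000, 10000):
    scaled = n * kendall_tau_null_variance(n)  # variance of n^(1/2) * tau
    print(n, float(scaled))

limit_gap = abs(10000 * kendall_tau_null_variance(10000) - Fraction(4, 9))
```

The printed values decrease toward 4/9, which is about 0.4444.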

Let $\hat\beta = (\hat\beta_1, \hat\beta_2)'$ be an estimator of $\beta = (\beta_1, \beta_2)'$. We shall
apply our Theorem 3.2.10 to obtain the asymptotic normality of

\[ n^{1/2}[T_n(\hat\beta) - \theta_n(\beta)] \]

under a sequence of alternatives approaching the null. To that
effect, we need first to verify the conditions of Theorem 3.2.8.
Condition 3.2.4 is discussed under Condition 2.4.5 of the previous
chapter. Also by Remark 3.2.17, step (3.2.7) of Condition 3.2.5 holds
if (3.2.6) and (3.2.18) are satisfied. But, from the definition of
the kernel h(.), (3.2.18) is immediate with $M_1 = 2$. There remains to
prove step (3.2.6) of Condition 3.2.5, i.e., we need to show that
there exists a constant $K_1 > 0$ such that for every n,

\[ E\Big[\sup_{\gamma' \in D(\gamma,d)} |h(S_{1:n}, S_{2:n}; \gamma') - h(S_{1:n}, S_{2:n}; \gamma)|\Big] \le K_1 d . \]


The  proof  of  this  step  is  identical  to  that  given  in  verifying 
Condition  2.4.6.,  except  that  here  Kj^(.)  [kj^(.)J  and  K2(.)  [k2(.)] 
denote  the  distribution  functions  (density  functions)  of 


and 


^l“^2 


respectively. 

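Thus, the conditions of Theorem 3.2.8 hold, and to apply Theorem
3.2.10 we only need to show that (i) for every n, $\theta_n(\gamma)$ has a zero
differential at $\gamma = \beta$, and that (ii) $\theta_n(\gamma)$ is uniformly (in n)
differentiable at $\gamma = \beta$. Using the notation developed above, we have

\[
\begin{aligned}
\theta_n(\gamma) &= E[h(S_{1:n}, S_{2:n}; \gamma)] \\
 &= P\{[T_1 + \Delta_n T_3 - (\gamma_1 - \beta_1)(X_1 - X_2)][T_2 + \Delta_n T_3 - (\gamma_2 - \beta_2)(X_1 - X_2)] > 0\} \\
 &\quad - P\{[T_1 + \Delta_n T_3 - (\gamma_1 - \beta_1)(X_1 - X_2)][T_2 + \Delta_n T_3 - (\gamma_2 - \beta_2)(X_1 - X_2)] < 0\} \\
 &= 2P\{T_1 + \Delta_n T_3 - (\gamma_1 - \beta_1)(X_1 - X_2) > 0 ,\ T_2 + \Delta_n T_3 - (\gamma_2 - \beta_2)(X_1 - X_2) > 0\} \\
 &\quad + 2P\{T_1 + \Delta_n T_3 - (\gamma_1 - \beta_1)(X_1 - X_2) < 0 ,\ T_2 + \Delta_n T_3 - (\gamma_2 - \beta_2)(X_1 - X_2) < 0\} \\
 &\quad - 1 .
\end{aligned}
\]

Conditioning on $X_1$, $X_2$ and $T_3$, and using the independence of $T_1$, $T_2$, $T_3$ and
X, we can write, with $b_i(\gamma) = (\gamma_i - \beta_i)(X_1 - X_2)$, i = 1, 2,

\[
\begin{aligned}
\theta_n(\gamma) &= E_{X_1,X_2,T_3}\{2P[T_1 > b_1(\gamma) - \Delta_n t_3]\,P[T_2 > b_2(\gamma) - \Delta_n t_3] \\
 &\qquad\qquad + 2P[T_1 < b_1(\gamma) - \Delta_n t_3]\,P[T_2 < b_2(\gamma) - \Delta_n t_3] - 1 \mid X_1 = x_1, X_2 = x_2, T_3 = t_3\} \\
 &= E_{X_1,X_2,T_3}\{2[1 - F_1(b_1(\gamma) - \Delta_n t_3)][1 - F_2(b_2(\gamma) - \Delta_n t_3)] \\
 &\qquad\qquad + 2F_1(b_1(\gamma) - \Delta_n t_3)F_2(b_2(\gamma) - \Delta_n t_3) - 1 \mid X_1 = x_1, X_2 = x_2, T_3 = t_3\} \\
 &= E_{X_1,X_2,T_3}\{J(X_1, X_2, T_3; \gamma)\} .
\end{aligned}
\tag{3.3.9}
\]

By Conditions 3.3.2 and 3.3.4 we can pass differentiation with respect
to $\gamma$ inside the expectation to obtain

\[ \frac{\partial\theta_n(\gamma)}{\partial\gamma} = E_{X_1,X_2,T_3}\Big[\frac{\partial J(x_1, x_2, t_3; \gamma)}{\partial\gamma}\Big] . \]

Differentiating first with respect to $\gamma_1$, we have

\[ \frac{\partial J(.; \gamma)}{\partial\gamma_1} = 2(x_1 - x_2)\, f_1(b_1(\gamma) - \Delta_n t_3)\,[2F_2(b_2(\gamma) - \Delta_n t_3) - 1] , \]

which, when evaluated at $\gamma = \beta$, gives

\[ 2(x_1 - x_2)\, f_1(-\Delta_n t_3)\,[2F_2(-\Delta_n t_3) - 1] . \]

Similarly,

\[ \frac{\partial J(.; \gamma)}{\partial\gamma_2}\Big|_{\gamma = \beta} = 2(x_1 - x_2)\, f_2(-\Delta_n t_3)\,[2F_1(-\Delta_n t_3) - 1] . \]

Each of the above two expressions has a zero expectation with respect
to $X_1$ and $X_2$, so that, for every n, $\theta_n(\gamma)$ has a zero differential at
$\gamma = \beta$.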

To show that $\theta_n(\gamma)$ is uniformly (in n) differentiable at $\gamma = \beta$,
we need to show that for every $\varepsilon > 0$ there exists $N_\varepsilon(\beta)$, a neighborhood
of $\beta$, and $N^*_\varepsilon$ such that for $\gamma \in N_\varepsilon(\beta)$ and $n > N^*_\varepsilon$,

\[ |\theta_n(\gamma) - \theta_n(\beta)| \le \varepsilon \|\gamma - \beta\| . \]

To establish this, we use the following lemma which follows
immediately by the Lebesgue Dominated Convergence Theorem.

LEMMA 3.3.10

Let V denote some random variable (not necessarily independent of
$X_1$ and $X_2$). If $E[|X_1 - X_2|] < \infty$, then

\[ E\{|X_1 - X_2|\,|2F(\Delta V) - 1|\} \to 0 \quad \text{as } \Delta \to 0 , \]

where F is an absolutely continuous distribution function with
$F(0) = 1/2$.

Now,  from  (3.3.5)  and  (3.3.9) 


9„(£)  = E 


and 


W = Exj.X3.T3  {2Cl-El<^rV3'^tE-E2<‘>2-''nT3>J 

* 2Fj(bj-i„T3)F3(b2-V3)  - U . 


Where  b^.  = (y^.-8^.  )(X^-X2),  i = 1,  2. 
It  follows  that 
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0 (Y)  - 0 (S) 
n ~ n ~ 


= E 


X^.X^.Tj  f2[l-flOrV3)Kl-F2(b2-V3)J 


* 2Fllbi-V3>F2‘'=2-V3> 


2[l-Fi(-i„T3)]Ll-F2(-V3)J 


2F  (-4  T )F  (-4  TJ  . 
1 n 0 2 no 


Subtracting,  then  adding  the  quantities 


and 


2[1-F^(bi-4J3)JU-F3(-4J3)J 


2F  (b,-4  T )F  (-4  T ) 
i 1 n o 2 no 


and  combining  terms,  we  obtain 


9 (Y)  - 0(B) 
n ~ n ~ 


= l2Cf2">2-V3>-'^2‘-‘nT3'JC2F3(b3-4j3 


*2CFj(b3-4^T3)-Fj(-4j3)J[2F3(-4^T3 
Xj.Xj.Tj  1*^11  * ^X^.Xj.Tj  (‘^2!  ■ 


H E 


where  and  are  the  two  terms  in  the  above  expectation. 
Taylor's  expansion,  we  have 


)-lJ 
)-l] 

Using 
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< E 


< E 


{2B2|b2l|2F^(b^-A^T3)-ll} 


< 2B2»Y-6»  E {|Xi-X2ll2F^(b3(Y)-A^T3)-l} 


where 


b,.  -=  b,(y)  = (y,-6,)(Xi-X3)  . 


But, for $\gamma$ close enough to $\beta$ and $\Delta_n$ sufficiently small, i.e., $n$ sufficiently large, Lemma 3.3.10 shows that we can bound

$$2B_2\,E\{|X_1-X_2|\,|2F_1(b_1(\gamma)-\Delta_n T_3) - 1|\}$$

by $\varepsilon/2$, so that $|E[W_1]| < (\varepsilon/2)\|\gamma - \beta\|$. A similar bound exists for $|E[W_2]|$, and the result obtains.

All of the conditions of Theorem 3.2.10 have been verified, and therefore we conclude the following.

THEOREM 3.3.11

Under assumptions 3.3.1 - 3.3.3,


$$\tfrac{3}{2}\,n^{1/2}[T_n(\hat\beta) - \theta_n(\beta)] \xrightarrow{d} N(0,1), \quad \text{as } n \to \infty .$$

3.4  The Asymptotic Normality of Pearson's
Partial Correlation Coefficient

The partial correlation between the variables $Y$ and $Z$ with $X$ held constant is defined to be

$$R_{YZ.X} = \frac{R_{YZ} - R_{YX}R_{ZX}}{[(1-R_{YX}^2)(1-R_{ZX}^2)]^{1/2}}, \tag{3.4.1}$$

where $R_{ab}$ is the usual product moment correlation between the variables $a$ and $b$, i.e.,

$$R_{ab} = \frac{S_{ab}}{(S_{aa}S_{bb})^{1/2}},$$

with

$$S_{ab} = \sum_{i=1}^{n}(a_i - \bar a)(b_i - \bar b) \quad\text{and}\quad S_{aa} = \sum_{i=1}^{n}(a_i - \bar a)^2 . \tag{3.4.2}$$

Suppose that $Y$ and $Z$ are both related to $X$ by the simple linear models

$$Y_i = \alpha_1 + \beta_1 X_i + E_i \quad\text{and}\quad Z_i = \alpha_2 + \beta_2 X_i + E_i', \quad i = 1, 2, \ldots, n. \tag{3.4.3}$$

Letting $\hat\alpha_1$ ($\hat\alpha_2$) and $\hat\beta_1$ ($\hat\beta_2$) be the OLS estimators of $\alpha_1$ ($\alpha_2$) and $\beta_1$ ($\beta_2$), respectively, we obtain the following residuals

$$U_i = Y_i - \hat\alpha_1 - \hat\beta_1 X_i \quad\text{and}\quad V_i = Z_i - \hat\alpha_2 - \hat\beta_2 X_i, \quad i = 1, 2, \ldots, n. \tag{3.4.4}$$

It can be shown that, under the linear models (3.4.3), $R_{YZ.X}$ is equal to the partial correlation coefficient between $E$ and $E'$ holding $X$ constant, i.e., $R_{YZ.X} = R_{EE'.X}$. This statistic may be written as

$$R_{EE'.X} = R_{YZ.X} = R_{UV} = \frac{R_{EE'} - R_{EX}R_{E'X}}{[(1-R_{EX}^2)(1-R_{E'X}^2)]^{1/2}} .$$

Expressing each of the correlation coefficients in the above expression in terms of the appropriate sums of squares and cross products, we have

$$R_{EE'.X} = \frac{S_{XX}S_{EE'} - S_{XE}S_{XE'}}{[(S_{XX}S_{EE} - S_{XE}^2)(S_{XX}S_{E'E'} - S_{XE'}^2)]^{1/2}} . \tag{3.4.5}$$

For efficiency studies, we shall obtain the asymptotic normality of the partial correlation statistic under the "trivariate reduction" model proposed earlier, i.e., when $E$ and $E'$ are related by

$$E_i = W_{1i} + \Delta W_{3i} \quad\text{and}\quad E_i' = W_{2i} + \Delta W_{3i}, \quad i = 1, 2, \ldots, n, \tag{3.4.6}$$

where $\{W_{1i}\}$, $\{W_{2i}\}$, $\{W_{3i}\}$, $i = 1, 2, \ldots, n$, are three independent random samples having the same distribution as the continuous random variables $W_1$, $W_2$ and $W_3$ with distribution functions $G_1(\cdot)$, $G_2(\cdot)$ and $G_3(\cdot)$, respectively. In addition we need the following assumptions:

3.4.7  The variables $X$, $W_1$, $W_2$, $W_3$ have zero means, and $\sigma_X^2 = 1$.

3.4.8  The variables $W_1$, $W_2$ and $W_3$ have finite second moments.

3.4.9  The variable $X$ has a finite fourth moment.

Note that there is no loss of generality in assumption 3.4.7. Since the statistic $R_{EE'.X}$ is a function of "translation invariant" cross products and sums of squares, it is free of the locations of $X$, $W_1$, $W_2$ and $W_3$, and hence no generality is lost in the zero-mean assumption. Also, $R_{EE'.X}$ is free of $\sigma_X^2$, the variance of $X$, since replacing $X_i$ by $X_i/\sigma_X$ does not affect the value of $R_{EE'.X}$, so that we may safely take $\sigma_X^2 = 1$. Denoting the variances of $W_1$, $W_2$ and $W_3$ by $\sigma_1^2$, $\sigma_2^2$ and $\sigma_3^2$, respectively, we see that

$$\mathrm{Corr}(E, E') = \Delta^2\sigma_3^2/[(\sigma_1^2 + \Delta^2\sigma_3^2)(\sigma_2^2 + \Delta^2\sigma_3^2)]^{1/2} . \tag{3.4.10}$$

Note again that under the "trivariate reduction" model, the test of independence is equivalent to testing $H_0: \Delta = 0$, where in general we may consider $\Delta$ to be a function of $n$. We shall denote the partial correlation coefficient by $R_n = R_n(X, W_1, W_2, W_3)$, which is the same as the quantity $R_{EE'.X}$, with $E$ and $E'$ being replaced by their corresponding values in terms of $\Delta$ and the $W$'s. Using the same notation as in (3.4.2), we calculate the new sums of squares and cross products involved in the statistic $R_n$ to be

$$S_{EE'} = \sum_{i=1}^{n}(E_i - \bar E)(E_i' - \bar E') = S_{W_1W_2} + \Delta S_{W_1W_3} + \Delta S_{W_2W_3} + \Delta^2 S_{W_3W_3},$$

and similarly,

$$S_{EE} = S_{W_1W_1} + 2\Delta S_{W_1W_3} + \Delta^2 S_{W_3W_3},$$

$$S_{E'E'} = S_{W_2W_2} + 2\Delta S_{W_2W_3} + \Delta^2 S_{W_3W_3},$$

$$S_{XE} = S_{XW_1} + \Delta S_{XW_3}, \tag{3.4.11}$$

and

$$S_{XE'} = S_{XW_2} + \Delta S_{XW_3} .$$


The asymptotic normality of $R_n$ may be obtained in one of two ways. Viewing $R_n$ to be the usual product moment correlation coefficient applied to the residuals, one may think of $R_n$ as a function of three U-statistics, and then use theorems such as that of Randles (1982) or our extended version Theorem 3.2.8 to obtain its asymptotic normality. The other method is to obtain the asymptotic normality of $R_n$ by considering it to be a function of several sample moments. Here, we shall follow the second approach, since it is more straightforward and since it assumes finite moments up to order 4 rather than 6, as would be required by the first approach. For this, we need to apply the following theorem by Kepner (1979):

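THEOREM 3.4.12

Let $Q_{i,n}$, for $i = 1, 2, \ldots, n$, be a sequence of $n$ i.i.d. random vectors, where

$$Q_{i,n} = (Q_{1i,n}, \ldots, Q_{pi,n})', \quad i = 1, \ldots, n,$$

and

$$E[Q_{i,n}] = \mu_n, \quad\text{where}\quad \mu_n = (\mu_{1,n}, \ldots, \mu_{p,n})' .$$

Let

$$Z_{j,n} = \frac{1}{n}\sum_{i=1}^{n} Q_{ji,n}, \quad j = 1, \ldots, p,$$

and

$$Z_n = (Z_{1,n}, \ldots, Z_{p,n})' .$$

Let $S$ be a neighborhood of $\mu$ in $R^p$ and suppose that $g: S \to R$ is a function possessing continuous partial derivatives of order 2 at each point of $S$. If

$$n^{1/2}(Z_n - \mu_n) \xrightarrow{d} N_p(0, \Sigma),$$

then

$$n^{1/2}[g(Z_n) - g(\mu_n)] \xrightarrow{d} N(0, d'\Sigma d),$$

where

$$d = \left(\frac{\partial g(z)}{\partial z_1}\Big|_{z=\mu}, \ldots, \frac{\partial g(z)}{\partial z_p}\Big|_{z=\mu}\right)' .$$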

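In our case we shall let

$$Q_{1i,n} = X_i, \quad Q_{2i,n} = X_i^2, \quad Q_{3i,n} = W_{1i}, \quad Q_{4i,n} = W_{1i}^2, \quad Q_{5i,n} = W_{2i}, \quad Q_{6i,n} = W_{2i}^2,$$

$$Q_{7i,n} = W_{3i}, \quad Q_{8i,n} = W_{3i}^2, \quad Q_{9i,n} = X_iW_{1i}, \quad Q_{10i,n} = X_iW_{2i}, \quad Q_{11i,n} = X_iW_{3i},$$

$$Q_{12i,n} = W_{1i}W_{2i}, \quad Q_{13i,n} = W_{1i}W_{3i}, \quad Q_{14i,n} = W_{2i}W_{3i}, \quad Q_{15i,n} = \Delta_n .$$

Suppressing the $n$ subscript on the elements of $Z_n$, these are given by

$$Z_1 = \bar X, \quad Z_2 = \frac{1}{n}\sum_{i=1}^{n}X_i^2, \quad Z_3 = \bar W_1, \quad Z_4 = \frac{1}{n}\sum_{i=1}^{n}W_{1i}^2,$$

$$Z_5 = \bar W_2, \quad Z_6 = \frac{1}{n}\sum_{i=1}^{n}W_{2i}^2, \quad Z_7 = \bar W_3, \quad Z_8 = \frac{1}{n}\sum_{i=1}^{n}W_{3i}^2,$$

$$Z_9 = \frac{1}{n}\sum_{i=1}^{n}X_iW_{1i}, \quad Z_{10} = \frac{1}{n}\sum_{i=1}^{n}X_iW_{2i}, \quad Z_{11} = \frac{1}{n}\sum_{i=1}^{n}X_iW_{3i},$$

$$Z_{12} = \frac{1}{n}\sum_{i=1}^{n}W_{1i}W_{2i}, \quad Z_{13} = \frac{1}{n}\sum_{i=1}^{n}W_{1i}W_{3i}, \quad Z_{14} = \frac{1}{n}\sum_{i=1}^{n}W_{2i}W_{3i},$$

and $Z_{15} = \Delta_n$, with

$$\mu_n = (0, 1, 0, \sigma_1^2, 0, \sigma_2^2, 0, \sigma_3^2, 0, 0, 0, 0, 0, 0, \Delta_n)', \tag{3.4.13}$$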


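where $\sigma_i^2 = \mathrm{Var}[W_i]$, $i = 1, 2, 3$. As functions of $Z_n$ we can write

$$S_{XX}/n = \sum_{i=1}^{n}X_i^2/n - \bar X^2 = Z_2 - Z_1^2$$

and similarly

$$S_{XW_1}/n = Z_9 - Z_1Z_3, \quad S_{XW_2}/n = Z_{10} - Z_1Z_5, \quad S_{XW_3}/n = Z_{11} - Z_1Z_7,$$

$$S_{W_1W_1}/n = Z_4 - Z_3^2, \quad S_{W_1W_2}/n = Z_{12} - Z_3Z_5, \quad S_{W_1W_3}/n = Z_{13} - Z_3Z_7,$$

$$S_{W_2W_2}/n = Z_6 - Z_5^2, \quad S_{W_2W_3}/n = Z_{14} - Z_5Z_7,$$

and

$$S_{W_3W_3}/n = Z_8 - Z_7^2 .$$

It follows that

$$S_{EE'}/n = (Z_{12} - Z_3Z_5) + \Delta_n(Z_{13} - Z_3Z_7) + \Delta_n(Z_{14} - Z_5Z_7) + \Delta_n^2(Z_8 - Z_7^2),$$

$$S_{EE}/n = (Z_4 - Z_3^2) + 2\Delta_n(Z_{13} - Z_3Z_7) + \Delta_n^2(Z_8 - Z_7^2),$$

$$S_{E'E'}/n = (Z_6 - Z_5^2) + 2\Delta_n(Z_{14} - Z_5Z_7) + \Delta_n^2(Z_8 - Z_7^2),$$

$$S_{XE}/n = (Z_9 - Z_1Z_3) + \Delta_n(Z_{11} - Z_1Z_7),$$

and

$$S_{XE'}/n = (Z_{10} - Z_1Z_5) + \Delta_n(Z_{11} - Z_1Z_7) .$$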

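Substituting in (3.4.11) and (3.4.5), $R_n = g(Z_n)$ can be written as

$$g(Z_n) = \frac{N_1(Z_n) - N_2(Z_n)}{[D(Z_n)]^{1/2}}, \tag{3.4.14}$$

where

$$N_1(Z_n) = (Z_2 - Z_1^2)[(Z_{12} - Z_3Z_5) + \Delta_n(Z_{13} - Z_3Z_7) + \Delta_n(Z_{14} - Z_5Z_7) + \Delta_n^2(Z_8 - Z_7^2)],$$

$$N_2(Z_n) = [Z_9 - Z_1Z_3 + \Delta_n(Z_{11} - Z_1Z_7)][Z_{10} - Z_1Z_5 + \Delta_n(Z_{11} - Z_1Z_7)],$$

and

$$D(Z_n) = \{(Z_2 - Z_1^2)[(Z_4 - Z_3^2) + 2\Delta_n(Z_{13} - Z_3Z_7) + \Delta_n^2(Z_8 - Z_7^2)] - [Z_9 - Z_1Z_3 + \Delta_n(Z_{11} - Z_1Z_7)]^2\}$$

$$\times \{(Z_2 - Z_1^2)[(Z_6 - Z_5^2) + 2\Delta_n(Z_{14} - Z_5Z_7) + \Delta_n^2(Z_8 - Z_7^2)] - [Z_{10} - Z_1Z_5 + \Delta_n(Z_{11} - Z_1Z_7)]^2\} .$$

With $\mu_n$ as given in (3.4.13), $N_1(\mu_n) = \Delta_n^2\sigma_3^2$, $N_2(\mu_n) = 0$, and

$$g(\mu_n) = \Delta_n^2\sigma_3^2/[(\sigma_1^2 + \Delta_n^2\sigma_3^2)(\sigma_2^2 + \Delta_n^2\sigma_3^2)]^{1/2}, \tag{3.4.15}$$

which is nothing but Corr$(E, E')$ given in (3.4.10).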
Next, define

$$Z_n^* = (Z_1, Z_2, \ldots, Z_{14})'$$

and

$$\mu^* = (\mu_1, \mu_2, \ldots, \mu_{14})',$$

where $\mu_1, \mu_2, \ldots, \mu_{14}$ are the first 14 elements of $\mu_n$ given in (3.4.13), which are free of $n$. It follows that (see, for example, Serfling, 1980, pp. 125-6)

$$n^{1/2}(Z_n^* - \mu^*) \xrightarrow{d} N_{14}(0, \Sigma^*),$$

where $\Sigma^*$ is the variance-covariance matrix of the vector

$$(X, X^2, W_1, W_1^2, W_2, W_2^2, W_3, W_3^2, XW_1, XW_2, XW_3, W_1W_2, W_1W_3, W_2W_3) .$$


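The matrix $\Sigma^*$ may be written in the partitioned form

$$\Sigma^* = \begin{pmatrix} M_{8\times 8} & 0 \\ 0 & S_{6\times 6} \end{pmatrix},$$

where

$$M = \begin{pmatrix}
1 & E[X^3] & 0 & 0 & 0 & 0 & 0 & 0 \\
E[X^3] & E[X^4]-1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & E[W_1^2] & E[W_1^3] & 0 & 0 & 0 & 0 \\
0 & 0 & E[W_1^3] & E[W_1^4]-\sigma_1^4 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & E[W_2^2] & E[W_2^3] & 0 & 0 \\
0 & 0 & 0 & 0 & E[W_2^3] & E[W_2^4]-\sigma_2^4 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & E[W_3^2] & E[W_3^3] \\
0 & 0 & 0 & 0 & 0 & 0 & E[W_3^3] & E[W_3^4]-\sigma_3^4
\end{pmatrix}$$

and

$$S = \mathrm{diag}(\sigma_1^2,\ \sigma_2^2,\ \sigma_3^2,\ \sigma_1^2\sigma_2^2,\ \sigma_1^2\sigma_3^2,\ \sigma_2^2\sigma_3^2) .$$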

Also, note that $n^{1/2}(Z_{15} - \Delta_n)$ converges in distribution to a normal random variable degenerate at zero, which implies that

$$n^{1/2}(Z_n - \mu_n) \xrightarrow{d} N_{15}(0, \Sigma),$$

where

$$\Sigma = \begin{pmatrix} \Sigma^* & 0 \\ 0' & 0 \end{pmatrix} .$$

To obtain the asymptotic variance of $R_n = g(Z_n)$ we need the vector

$$d = (d_1, \ldots, d_{15})', \quad d_i = \frac{\partial g(z)}{\partial z_i}\Big|_{z=\mu},$$

for $i = 1, 2, \ldots, 15$.

Since the 15th diagonal element of $\Sigma$ is zero, we only need to calculate the elements $d_1, d_2, \ldots, d_{14}$ of $d$. Our calculations indicated that, except for $d_{12} = 1/(\sigma_1\sigma_2)$, the remaining elements of $d$ are all zero, so that the asymptotic variance of $g(Z_n)$ is

$$\sigma^2 = d'\Sigma d = 1 .$$
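The gradient calculation can be checked numerically. The sketch below is illustrative only: it codes $g$ directly from (3.4.5) and (3.4.11), treats the 15th coordinate as $\Delta$, evaluates central differences at $\mu$ with $\Delta = 0$, and uses hypothetical variances $\sigma_1^2 = 4$, $\sigma_2^2 = 9$, $\sigma_3^2 = 2$. The function names are not from the source.

```python
import math


def g(z):
    """R_n as a function of the 15 sample moments; z[14] plays the role of Delta."""
    z1, z2, z3, z4, z5, z6, z7, z8, z9, z10, z11, z12, z13, z14, delta = z
    sxx = z2 - z1 ** 2
    s11, s22, s33 = z4 - z3 ** 2, z6 - z5 ** 2, z8 - z7 ** 2
    sx1, sx2, sx3 = z9 - z1 * z3, z10 - z1 * z5, z11 - z1 * z7
    s12, s13, s23 = z12 - z3 * z5, z13 - z3 * z7, z14 - z5 * z7
    # Sums of squares and cross products of E and E', as in (3.4.11), divided by n.
    s_ee_prime = s12 + delta * s13 + delta * s23 + delta ** 2 * s33
    s_ee = s11 + 2 * delta * s13 + delta ** 2 * s33
    s_e_prime2 = s22 + 2 * delta * s23 + delta ** 2 * s33
    s_xe = sx1 + delta * sx3
    s_xe_prime = sx2 + delta * sx3
    numerator = sxx * s_ee_prime - s_xe * s_xe_prime
    denominator = math.sqrt((sxx * s_ee - s_xe ** 2)
                            * (sxx * s_e_prime2 - s_xe_prime ** 2))
    return numerator / denominator


def gradient(f, point, h=1e-5):
    """Central-difference approximation to the gradient of f at point."""
    grad = []
    for i in range(len(point)):
        up, down = list(point), list(point)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


# Hypothetical variances of W1, W2, W3 (assumptions for illustration).
sigma1_sq, sigma2_sq, sigma3_sq = 4.0, 9.0, 2.0
mu = [0.0, 1.0, 0.0, sigma1_sq, 0.0, sigma2_sq, 0.0, sigma3_sq,
      0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

d = gradient(g, mu)
# Only d_12 is non-zero, and Var(W1 * W2) = sigma1^2 * sigma2^2 under independence.
asymptotic_variance = d[11] ** 2 * sigma1_sq * sigma2_sq
print(d[11], asymptotic_variance)
```

Under these assumptions the only non-negligible component is $d_{12}$, and the implied asymptotic variance is 1 whatever variances are chosen.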

Now, $g$ is a ratio of two polynomial functions whose denominator admits non-zero second order differentials in a neighborhood $S$ of $\mu$. Therefore, $g$ possesses continuous second order partial derivatives in a neighborhood of $\mu$, allowing us to apply Theorem 3.4.12 to obtain:

THEOREM 3.4.16

Under conditions 3.4.7 - 3.4.9,

$$n^{1/2}[g(Z_n) - g(\mu_n)] \xrightarrow{d} N(0,1),$$

where $g(Z_n) = R_n$ and $g(\mu_n)$ is given in (3.4.15).

3.5  The Pitman Asymptotic Efficiency of $T_n$ Relative to $R_n$

In this section we shall apply Noether's generalization of a theorem by Pitman to obtain the asymptotic efficiency of $T_n$ relative to $R_n$, which we shall denote by ARE$(T_n, R_n)$. We first state the theorem by Noether (1955), and then verify its conditions for the two statistics $T_n$ and $R_n$.

Theorem 3.5.1 (Noether)

Consider testing $H_0: \theta = \theta_0$ versus $H_1: \theta > \theta_0$, and let $\{\theta_n\}$ be a sequence of alternative parameters with $\lim_{n\to\infty}\theta_n = \theta_0$. Suppose the test is based on the statistic $T_n = T(x_1, \ldots, x_n)$, and let $\Psi_n(\theta)$ and $\sigma_n^2(\theta)$ be functions of $\theta$ (in many cases these are respectively the mean and variance of $T_n$). Assume that

A.  $\Psi_n'(\theta_0) = \cdots = \Psi_n^{(m-1)}(\theta_0) = 0$, $\Psi_n^{(m)}(\theta_0) > 0$.

B.  $\lim_{n\to\infty} n^{-m\delta}\,\Psi_n^{(m)}(\theta_0)/\sigma_n(\theta_0) = c > 0$, for some $\delta > 0$.

The indicated derivatives are assumed to exist. We shall consider the power of the test based on $T_n$ with respect to the alternative $H': \theta_n = \theta_0 + k/n^{\delta}$, where $k$ is an arbitrary positive constant. In addition to A and B assume

C.  $\lim_{n\to\infty}\Psi_n^{(m)}(\theta_n)/\Psi_n^{(m)}(\theta_0) = 1$ and $\lim_{n\to\infty}\sigma_n(\theta_n)/\sigma_n(\theta_0) = 1$,

and

D.  The distribution of $[T_n - \Psi_n(\theta)]/\sigma_n(\theta)$ tends to the standard normal distribution, both under the alternative hypothesis $H'$ and under the null hypothesis $H_0: \theta = \theta_0$.

If $T_{1n}$ and $T_{2n}$ are two statistics for testing $H_0$ against $H'$, and if $m_1 = m_2 = m$ and $\delta_1 = \delta_2 = \delta$, then the ARE of the two tests satisfying A, B, C and D is given by

$$\mathrm{ARE}(T_{2n}, T_{1n}) = \lim_{n\to\infty}\left[\frac{R_{2n}(\theta_0)}{R_{1n}(\theta_0)}\right]^{1/(m\delta)},$$

where $R_{in}(\theta) = \Psi_{in}^{(m)}(\theta)/\sigma_{in}(\theta)$, $i = 1, 2$.

Pitman has called the quantity $R_{in}^2(\theta_0)$ the efficacy of the $i$th test in testing the hypothesis $H_0: \theta = \theta_0$.

Our hypothesis is given by $H_0: \Delta = 0$ versus $H_1: \Delta > 0$, where $\Delta$ is such that

$$E_i = W_{1i} + \Delta W_{3i} \quad\text{and}\quad E_i' = W_{2i} + \Delta W_{3i} .$$

In addition to assumptions 3.3.1 - 3.3.4 of section 3.3 and assumptions 3.4.7 - 3.4.9 of section 3.4 we need the following assumption:

3.5.2  The density functions $f_k(\cdot)$ of $T_k = W_{k1} - W_{k2}$, $k = 1, 2, 3$, have continuous and bounded derivatives.

Next, we shall verify the conditions of Theorem 3.5.1 for each of the statistics $T_n$ and $R_n$, using the same notation adopted by Noether. Here, we shall let $\{\Delta_n\}$ denote a sequence of alternative parameters converging to the null, i.e., $\lim_{n\to\infty}\Delta_n = 0$.

Application of 3.5.1 to the statistic $T_n$:

With $\theta = \Delta$, $\theta_0 = 0$ and $\beta$ denoting the vector of slope parameters, we have from (3.3.5)

$$\Psi_n(\Delta) = E[T_n(\beta, \Delta)]$$

$$= 2P\{T_1 + \Delta T_3 > 0,\ T_2 + \Delta T_3 > 0\} + 2P\{T_1 + \Delta T_3 < 0,\ T_2 + \Delta T_3 < 0\} - 1$$

$$= E_{T_3}\{2[1-F_1(-\Delta T_3)][1-F_2(-\Delta T_3)] + 2F_1(-\Delta T_3)F_2(-\Delta T_3) - 1\}, \tag{3.5.3}$$

where $T_k = W_{k1} - W_{k2}$ has distribution function $F_k(\cdot)$ and density $f_k(\cdot)$, $k = 1, 2, 3$, and where $E_{T_3}$ denotes expectation with respect to the variable $T_3$. Therefore, we can write

$$\Psi_n(\Delta) = \int\{2[1-F_1(-\Delta t)][1-F_2(-\Delta t)] + 2F_1(-\Delta t)F_2(-\Delta t) - 1\}\,dF_3(t) \tag{3.5.4}$$

with $\Psi_n(0) = 0$ since $F_1(0) = F_2(0) = 1/2$.

The integrand of the above expression involves continuous bounded functions, so that by theorems such as Theorem A.2.4 of Randles and Wolfe (1979), the derivatives with respect to $\Delta$ may be taken inside the integral, to obtain

$$\Psi_n'(\Delta) = \int\{2tf_1(-\Delta t)[1-2F_2(-\Delta t)] + 2tf_2(-\Delta t)[1-2F_1(-\Delta t)]\}\,dF_3(t)$$

and, therefore, $\Psi_n'(0) = 0$ since $F_1(0) = F_2(0) = 1/2$.

Using assumption 3.5.2, we differentiate a second time,

$$\Psi_n''(\Delta) = \int\{8t^2f_1(-\Delta t)f_2(-\Delta t) - 2t^2f_1'(-\Delta t)[1-2F_2(-\Delta t)] - 2t^2f_2'(-\Delta t)[1-2F_1(-\Delta t)]\}\,dF_3(t),$$

so that

$$\Psi_n''(0) = \int 8t^2f_1(0)f_2(0)\,dF_3(t) = 8f_1(0)f_2(0)E[T_3^2] = 16\sigma_3^2f_1(0)f_2(0) > 0, \tag{3.5.5}$$

since with $E[W_3] = 0$, $E[T_3^2] = \mathrm{Var}[T_3] = \mathrm{Var}[W_{31} - W_{32}] = 2\sigma_3^2$.

This satisfies condition A, with $m = 2$. For the remaining conditions we shall take $\sigma_n^2(\Delta) = \sigma_n^2(0) = 4/(9n)$. Condition B follows with $\delta = 1/4$, since $\Psi_n''(0) = 16\sigma_3^2f_1(0)f_2(0)$, so that

$$\lim_{n\to\infty} n^{-m\delta}\,\Psi_n^{(m)}(0)/\sigma_n(0) = \lim_{n\to\infty} n^{-1/2}\cdot 16\sigma_3^2f_1(0)f_2(0)/(4/9n)^{1/2} = 24\sigma_3^2f_1(0)f_2(0) = c > 0 .$$

Condition C is immediate from the definition of $\sigma_n(\Delta_n)$. We also see that under assumption 3.5.2

$$\lim_{n\to\infty}\Psi_n''(\Delta_n) = \Psi_n''(0) .$$

In sections 2.4 and 3.3, respectively, we have shown that

$$\frac{T_n - \Psi_n(\Delta_n)}{\sigma_n(\Delta_n)} = \tfrac{3}{2}\,n^{1/2}[T_n(\hat\beta) - \theta_n(\beta)] \xrightarrow{d} N(0,1),$$

as $n \to \infty$, both under the null hypothesis and under a sequence of alternatives, thereby proving condition D. The efficacy of the test based on $T_n$ is then given by

$$R_{T_n}^2(0) = [\Psi_n''(0)/\sigma_n(0)]^2 = 576\,n\,\sigma_3^4f_1^2(0)f_2^2(0),$$

where $f_1(\cdot)$ and $f_2(\cdot)$ are the probability density functions of $T_1 = W_{11} - W_{12}$ and $T_2 = W_{21} - W_{22}$, respectively, and $\sigma_3^2 = \mathrm{Var}[W_3]$.

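Application of 3.5.1 to the statistic $R_n$:

To verify the conditions of Noether's theorem we shall let

$$\Psi_n(\Delta) = \Delta^2\sigma_3^2/[(\sigma_1^2 + \Delta^2\sigma_3^2)(\sigma_2^2 + \Delta^2\sigma_3^2)]^{1/2}$$

and

$$\sigma_n^2(\Delta) = \sigma_n^2(0) = 1/n,$$

where $\sigma_i^2 = \mathrm{Var}[W_i]$, $i = 1, 2, 3$. Note that $\Psi_n(0) = 0$, and

$$\Psi_n'(\Delta) = 2\Delta\sigma_3^2(\sigma_1^2 + \Delta^2\sigma_3^2)^{-1/2}(\sigma_2^2 + \Delta^2\sigma_3^2)^{-1/2}$$

$$- \Delta^3\sigma_3^4(\sigma_1^2 + \Delta^2\sigma_3^2)^{-3/2}(\sigma_2^2 + \Delta^2\sigma_3^2)^{-1/2} - \Delta^3\sigma_3^4(\sigma_1^2 + \Delta^2\sigma_3^2)^{-1/2}(\sigma_2^2 + \Delta^2\sigma_3^2)^{-3/2},$$

so that $\Psi_n'(0) = 0$. Differentiating a second time and evaluating at $\Delta = 0$ we have

$$\Psi_n''(0) = 2\sigma_3^2/(\sigma_1\sigma_2) > 0$$

and hence condition A is satisfied with $m = 2$. Condition B is satisfied with $m = 2$ and $\delta = 1/4$, since

$$\lim_{n\to\infty} n^{-m\delta}\,\Psi_n^{(m)}(0)/\sigma_n(0) = \lim_{n\to\infty} n^{-1/2}\cdot\frac{2\sigma_3^2}{\sigma_1\sigma_2}\cdot n^{1/2} = 2\sigma_3^2/(\sigma_1\sigma_2) = c > 0 .$$

Condition C is immediate. Also, in the previous section we have shown that $[R_n - \Psi_n(\Delta_n)]/\sigma_n(\Delta_n)$ converges in distribution to the standard normal distribution, thus obtaining condition D. The efficacy of the test based on $R_n$ is then given by

$$R_{R_n}^2(0) = [\Psi_n''(0)/\sigma_n(0)]^2 = 4n\sigma_3^4/(\sigma_1^2\sigma_2^2) .$$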

THEOREM 3.5.6

Under assumptions 3.3.1 - 3.3.3, 3.4.7 - 3.4.9 and assumption 3.5.2, the asymptotic efficiency of $T_n$ relative to $R_n$ is

$$\mathrm{ARE}(T_n, R_n) = 144\,\sigma_1^2\sigma_2^2f_1^2(0)f_2^2(0) . \tag{3.5.7}$$

Here, $f_1(\cdot)$ [$f_2(\cdot)$] is the density function of the difference between two i.i.d. random variables $W_{11}$ and $W_{12}$ ($W_{21}$ and $W_{22}$). Since in the "trivariate reduction" model we implicitly assume knowledge of the distributions of $W_1$ and $W_2$, we need to find $f_1(0)$ and $f_2(0)$ in terms of $g_1(\cdot)$ and $g_2(\cdot)$, the respective densities of $W_1$ and $W_2$. It can be shown that

$$f_i(0) = \int g_i^2(x)\,dx, \quad i = 1, 2 .$$

Using the above relation, we have calculated ARE$(T_n, R_n)$ in the case where $W_1$, $W_2$ and $W_3$ have the same distribution. The results of these calculations for some well known distributions are given in Table 3.1.


Table 3.1

Asymptotic Relative Efficiencies

Distribution      ARE(T_n, R_n)
Normal            9/π² = 0.912
Uniform           1
Logistic          1.2
Laplace           2.25
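The entries of Table 3.1 can be reproduced from (3.5.7) with $f_i(0) = \int g_i^2(x)\,dx$. The sketch below is a hedged illustration rather than the original calculation: it assumes $W_1$, $W_2$ and $W_3$ share one zero-mean density, so that the ARE reduces to $144\sigma^4[\int g^2]^4$, and it uses a simple trapezoid rule on a truncated range. The scale of each density and the integration limits are arbitrary choices (the ARE does not depend on scale).

```python
import math


def integrate(f, lo, hi, steps=60000):
    """Composite trapezoid rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, steps):
        total += f(lo + k * h)
    return total * h


def are_same_distribution(density, lo, hi):
    """ARE(T_n, R_n) = 144 * sigma^4 * f(0)^4 when W1, W2, W3 share one zero-mean density."""
    variance = integrate(lambda x: x * x * density(x), lo, hi)
    f_at_zero = integrate(lambda x: density(x) ** 2, lo, hi)  # density of W_1 - W_2 at 0
    return 144.0 * variance ** 2 * f_at_zero ** 4


densities = {
    "Normal": (lambda x: math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi), -12.0, 12.0),
    "Uniform": (lambda x: 1.0, -0.5, 0.5),
    "Logistic": (lambda x: 0.25 / math.cosh(0.5 * x) ** 2, -60.0, 60.0),
    "Laplace": (lambda x: 0.5 * math.exp(-abs(x)), -60.0, 60.0),
}

are_table = {name: are_same_distribution(g, lo, hi) for name, (g, lo, hi) in densities.items()}
for name, value in are_table.items():
    print(f"{name:10s} {value:.3f}")
```

In closed form the four values are $9/\pi^2$, 1, $\pi^4/81$ and $9/4$, which match the tabled entries after rounding.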


CHAPTER  FOUR 
THE  CORRELATION  PROBLEM 


4.1  Introduction 


Let $(Y_1, Z_1), \ldots, (Y_n, Z_n)$ denote a random sample of $n$ observable pairs from some continuous bivariate population with distribution function $F$. As mentioned in chapter 1, the problem of interest in this chapter is to test the null hypothesis that there is no correlation between the variables $Y$ and $Z$, versus the alternative hypothesis that a correlation exists between these variables. If we let

$$\tau = P\{(Y_1-Y_2)(Z_1-Z_2) > 0\} - P\{(Y_1-Y_2)(Z_1-Z_2) < 0\} \tag{4.1.1}$$

be the correlation coefficient of interest, the above hypotheses translate to

$$H_0: \tau = 0 \quad\text{versus}\quad H_1: \tau \ne 0 \tag{4.1.2}$$

or the one-sided alternatives of positive correlation ($\tau > 0$) or negative correlation ($\tau < 0$). In chapter 1, we discussed the motivation behind using a coefficient such as $\tau$, and hypotheses such as those given in (4.1.2). In particular, we indicated that, at least to us, $\tau$ is a most natural measure for a "useful" relationship between the variables, in the sense that its values indicate whether larger values of $Y$ are associated with larger (or smaller) values of $Z$, and that, therefore, the hypotheses given in (4.1.2) are most appropriate. For these hypotheses, we shall use tests based on Pearson's $R$ and Kendall's tau statistics, although classical tests based on these two statistics assume the null hypothesis of the independence of $Y$ and $Z$. In section 4.2, we give a brief description of these tests for independence, and discuss their properties and their limitations for testing (4.1.2). Although the tests based on Kendall's tau and Pearson's $R$ have different consistency classes ($\tau \ne 0$ for the first, and $\rho \ne 0$ for the second), under the elliptically symmetric models studied in this chapter, these consistency classes are identical, since under such models $\tau \ne 0$ is equivalent to $\rho \ne 0$. We can thus base tests on either $R$ or Kendall's tau without being unfair to either test. In section 4.3, we propose some modifications of these tests in the hope of developing a procedure for testing the null hypothesis that $\tau = 0$. Section 4.4 contains the results of a Monte Carlo study investigating the performances of these tests, and our conclusions and recommendations are given in section 4.5.


4.2  Some  Tests  for  Independence 
Pearson's product moment correlation coefficient is given by

$$R = \frac{\sum_{i=1}^{n}(Y_i - \bar Y)(Z_i - \bar Z)}{\{\sum_{i=1}^{n}(Y_i - \bar Y)^2\sum_{i=1}^{n}(Z_i - \bar Z)^2\}^{1/2}}, \tag{4.2.1}$$

where

$$\bar Y = \sum_{i=1}^{n}Y_i/n \quad\text{and}\quad \bar Z = \sum_{i=1}^{n}Z_i/n .$$

The mean of $R$ is

$$E[R] = \rho + O(n^{-1}),$$


and  the  variance  is  given  by 


(4.2.2) 


where $\rho = \mathrm{Corr}(Y, Z)$. (See, for example, Cramer, 1966, p. 359.) Under the assumption that $\rho$ is 0 and $Y|Z$ (or $Z|Y$) is normal,

$$T = R[(n-2)/(1-R^2)]^{1/2}$$

has the Student's t-distribution with $(n-2)$ degrees of freedom (see, for example, Anderson, 1958, p. 64). From expression (4.2.2), we note that the asymptotic variance of $R$ depends on the parameter $\rho$. This motivates the use of a variance-stabilizing transformation. Such a transformation yields what is known as Fisher's Z,

$$Z = \tfrac{1}{2}\ln[(1+R)/(1-R)], \tag{4.2.3}$$

which under the assumption of normality has a limiting mean of $\tfrac{1}{2}\ln[(1+\rho)/(1-\rho)]$ and a limiting variance of $1/(n-3)$, so that under the hypothesis of independence ($\rho = 0$), $(n-3)^{1/2}Z$ has an asymptotic standard normal distribution. (See, for example, Anderson, 1958, p. 78.)
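The two classical statistics just described are simple functions of $R$. The following sketch, with hypothetical data and function names, computes $R$ from (4.2.1), the t-statistic, and the standardized Fisher's Z from (4.2.3); it makes no claim about how the original study was programmed.

```python
import math


def pearson_r(y, z):
    """Product moment correlation coefficient, as in (4.2.1)."""
    n = len(y)
    y_bar, z_bar = sum(y) / n, sum(z) / n
    s_yz = sum((yi - y_bar) * (zi - z_bar) for yi, zi in zip(y, z))
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    return s_yz / math.sqrt(s_yy * s_zz)


def t_statistic(r, n):
    """T = R [(n-2)/(1-R^2)]^(1/2), referred to Student's t with n-2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def fisher_z(r):
    """Fisher's variance-stabilizing transform Z = (1/2) ln[(1+R)/(1-R)], as in (4.2.3)."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


def standardized_fisher_z(r, n):
    """(n-3)^(1/2) Z, approximately standard normal when rho = 0 under normality."""
    return math.sqrt(n - 3) * fisher_z(r)


# Hypothetical sample of six pairs.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
z = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
r = pearson_r(y, z)
print(r, t_statistic(r, len(y)), standardized_fisher_z(r, len(y)))
```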

Kendall's tau is a U-statistic estimator of $\tau$ given in (4.1.1). It is

$$\hat\tau = \binom{n}{2}^{-1}\sum_{i<j}\mathrm{Sgn}\{(Y_i - Y_j)(Z_i - Z_j)\}, \tag{4.2.4}$$

where

$$\mathrm{Sgn}(t) = \begin{cases} 1 & \text{if } t > 0 \\ 0 & \text{if } t = 0 \\ -1 & \text{if } t < 0 . \end{cases}$$

This U-statistic has a symmetric kernel of degree 2 given by

$$h(X_1, X_2) = \mathrm{Sgn}\{(Y_1 - Y_2)(Z_1 - Z_2)\},$$

with $X = (Y, Z)'$. Note that

$$E[\hat\tau] = E[h(X_1, X_2)] = P\{(Y_1-Y_2)(Z_1-Z_2) > 0\} - P\{(Y_1-Y_2)(Z_1-Z_2) < 0\} = \tau .$$

Using results on the variance of a U-statistic, we have

$$\mathrm{Var}[\hat\tau] = \binom{n}{2}^{-1}[2(n-2)\zeta_1 + \zeta_2], \tag{4.2.5}$$

where

$$\zeta_1 = E[h(X_1, X_2)h(X_1, X_3)] - \tau^2 \tag{4.2.6}$$

and

$$\zeta_2 = E[h^2(X_1, X_2)] - \tau^2 = 1 - \tau^2 . \tag{4.2.7}$$

Letting $h_1(x)$ denote $E[h(X_1, X_2)\,|\,X_1 = x]$ and noting that $E[h_1(X_1)] = \tau$, we can write

$$\zeta_1 = E\{E[h(X_1, X_2)h(X_1, X_3)\,|\,X_1 = x]\} - \tau^2 = E[h_1^2(X_1)] - \tau^2 = \mathrm{Var}[h_1(X_1)] .$$

Under the hypothesis of the independence of $Y$ and $Z$, $\tau = 0$ and $\zeta_1 = 1/9$, so that the variance of $\hat\tau$ simplifies to

$$\mathrm{Var}[\hat\tau] = \frac{2(2n+5)}{9n(n-1)} . \tag{4.2.8}$$

In general, however, $\mathrm{Var}[\hat\tau]$ depends on the underlying bivariate distribution of $(Y, Z)$.
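The null variance in (4.2.8) can be confirmed exactly for small $n$. Under independence of continuous $Y$ and $Z$, every ordering of the $Z$ ranks relative to the $Y$ ranks is equally likely, so the sketch below (function names are illustrative) enumerates all $n!$ permutations and compares the exact variance of $\hat\tau$ with the formula, using rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations, permutations


def sgn(t):
    return (t > 0) - (t < 0)


def kendall_tau(y, z):
    """Kendall's tau as the U-statistic in (4.2.4), returned as an exact fraction."""
    n = len(y)
    total = sum(sgn((y[i] - y[j]) * (z[i] - z[j])) for i, j in combinations(range(n), 2))
    return Fraction(2 * total, n * (n - 1))


def exact_null_variance(n):
    """Variance of tau over all n! equally likely rank orderings (independence)."""
    y = list(range(n))
    taus = [kendall_tau(y, perm) for perm in permutations(range(n))]
    mean = sum(taus) / len(taus)
    return sum((t - mean) ** 2 for t in taus) / len(taus)


def formula_null_variance(n):
    """Right-hand side of (4.2.8)."""
    return Fraction(2 * (2 * n + 5), 9 * n * (n - 1))


for n in range(3, 7):
    print(n, exact_null_variance(n), formula_null_variance(n))
```

Enumeration is only practical for small $n$, but it makes the distribution-free property under independence concrete.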

To compare the powers of the tests based on the statistics $R$ and $\hat\tau$, one needs to define a suitable class of alternatives, i.e., a class of alternatives which is reasonably wide and reasonably easy to handle mathematically. One such class of alternatives was formulated by H.S. Konijn (1956). Similar classes were also proposed by S. Bhuchongkul (1964) and D.Y. Gokhale (1978). To obtain the class of alternatives, Konijn defines

$$Y = \lambda_1W_1 + \lambda_2W_2$$

and

$$Z = \lambda_3W_1 + \lambda_4W_2,$$

where $W_1$ and $W_2$ are two independent random variables, and the hypothesis to be tested is that of the independence of $Y$ and $Z$.

Konijn reports the asymptotic efficiency of $\hat\tau$ relative to $R$ for several distributions, in the case when $W_1$ and $W_2$ are identically distributed. The values of these AREs are $9/\pi^2 = 0.912$, 1, 0.86, and 1.266 for the normal, uniform, parabolic ($f(t) = kt^2$, for $a \le t \le b$), and the Laplace distributions, respectively. To compare the empirical powers of tests based on the statistics $R$ and $\hat\tau$ through a Monte Carlo simulation, we adopted a class of alternatives similar to the one proposed by Konijn, but involving only one parameter, $\Delta$. This class of alternatives was suggested by Hajek and Sidak (1967) and is given by

$$Y = W_1 + \Delta W_3 \quad\text{and}\quad Z = W_2 + \Delta W_3, \tag{4.2.9}$$

with $W_1$, $W_2$ and $W_3$ being mutually independent, so that the hypothesis of independence is equivalent to

$$H_0: \Delta = 0 .$$

Based on the AREs reported by Konijn, we expected Kendall's tau to perform better for heavy-tailed distributions. To our surprise, however, we found in our Monte Carlo studies that Pearson's $R$ exhibited a high degree of robustness in terms of its stable empirical $\alpha$-level and empirical power, even for heavy-tailed distributions such as the Cauchy distribution. To test the broader null hypothesis $\tau = 0$, we calculated empirical levels and powers for pairs of observations from some bivariate elliptically symmetric distributions (see Johnson and Ramberg, 1977). Here, the empirical $\alpha$-levels for tests based on both statistics, $R$ and $\hat\tau$, were largely inflated, although the $\alpha$-levels for tests based on Pearson's $R$ were much higher (details of this and other studies are given in sections 4.4 and 4.5). We suspected that these inflated levels were due to the fact that under $H_0: \tau = 0$, the variances of $R$ and $\hat\tau$ are different from those under the hypothesis of independence. This and other observations motivated us to propose some modifications to the classical tests based on $R$ and $\hat\tau$. A discussion of this is given in the next section.

4.3  Tests  for  Correlation 

If the hypothesis of the independence of $Y$ and $Z$ is relaxed, many of the properties of $\hat\tau$ and $R$ discussed in the previous section no longer hold. For example, under the hypothesis that $\tau = 0$, $E[\hat\tau] = 0$, but the variance of $\hat\tau$ depends on the underlying distribution $F$, and hence $\hat\tau$ is neither distribution-free nor asymptotically distribution-free (see the expression for $\mathrm{Var}[\hat\tau]$ given in (1.2.5)). From U-statistic theory, we know that

$$(\hat\tau - \tau)/\{\mathrm{Var}[\hat\tau]\}^{1/2} \xrightarrow{d} N(0,1)$$

and

$$n^{1/2}(\hat\tau - \tau)/(4\zeta_1)^{1/2} \xrightarrow{d} N(0,1)$$

as $n \to \infty$.

To test the hypothesis $\tau = 0$, Fligner and Rust (1983) considered several estimators for $4\zeta_1$. They recommend the use of the jackknife estimator $\hat\sigma_J^2$ given by

$$\hat\sigma_J^2 = (n-1)\sum_{i=1}^{n}(\hat\tau^{(i)} - \hat\tau)^2, \tag{4.3.1}$$

where $\hat\tau^{(i)}$ is Kendall's tau computed on the subsample of size $(n-1)$ formed by leaving out the $i$th pair. If one defines $C_i$ as

$$C_i = \sum_{j=1}^{n}\mathrm{Sgn}\{(Y_i - Y_j)(Z_i - Z_j)\}, \quad i = 1, 2, \ldots, n, \tag{4.3.2}$$

then one can show that

$$\hat\tau = \sum_{i=1}^{n}C_i/n(n-1) = \bar C/(n-1), \tag{4.3.3}$$

and that

$$\hat\sigma_J^2 = 4\sum_{i=1}^{n}(C_i - \bar C)^2/(n-1)(n-2)^2, \tag{4.3.4}$$

where

$$\bar C = \sum_{i=1}^{n}C_i/n .$$

Fligner and Rust obtained the statistic

$$K^* = n^{1/2}\hat\tau/\hat\sigma_J = \frac{(n-2)\sum_{i=1}^{n}C_i}{2[n(n-1)]^{1/2}[\sum_{i=1}^{n}(C_i - \bar C)^2]^{1/2}}$$

and observed that, since $C_1, \ldots, C_n$ depend on the observations through their marginal rankings, $K^*$ is distribution-free under the hypothesis of independence, and that the tests based on $\hat\tau$ and $K^*$ have equivalent consistency classes and asymptotic relative efficiencies. They further note that an advantage of $K^*$ over $\hat\tau$ is that $K^*$ is also asymptotically distribution-free under the hypothesis $\tau = 0$. One drawback to using the Fligner-Rust statistic is that $\hat\sigma_J^2$ may be identically zero even in non-extreme cases. In a preliminary simulation study, we have discovered several rank configurations such as the one given below where $\hat\sigma_J^2 = 0$. When, for example, the ranks are

Rank (Y):  1 2 3 4 5 6 7 8
Rank (Z):  5 6 7 8 1 2 3 4,

$C_i = -1$, $i = 1, 2, \ldots, 8$, and therefore $\hat\sigma_J^2 = 0$. For extreme cases of "perfect concordance" or "perfect discordance," it is reasonable to assume a very small value for $\hat\sigma_J^2$ (i.e., a very large value for $K^*$), thereby rejecting the null hypothesis that $\tau = 0$. However, such a procedure should not be used for situations similar to the one given above where $\hat\tau = -1/7$ and hence no indication of either concordance or discordance is present.

Another estimator of $\mathrm{Var}[\hat\tau]$ was proposed by Noether (1967). Using the notation developed above, his variance estimator may be written as


^ ^ "I  o ^ o 

A disadvantage of this variance estimator is that it may be negative. For example, for the rank configuration given above, $C_i = \bar C = -1$, $i = 1, 2, \ldots, n$, and $\hat\tau = -1/7$, so that

$$\widehat{\mathrm{Var}}[\hat\tau] = -\frac{96}{49\,n(n-1)} .$$


We propose a variance estimator which is guaranteed to be positive except for the extreme cases of $\hat\tau = \pm 1$. This is the consistent estimator of $\mathrm{Var}[\hat\tau]$ based on the sample estimators of $\zeta_1$ and $\zeta_2$, similar to those considered by Randles, Fligner, Policello and Wolfe (1980) and previously developed by Sen (1960). The variance of $\hat\tau$ is given in (4.2.5) as

$$\mathrm{Var}[\hat\tau] = \binom{n}{2}^{-1}[2(n-2)\zeta_1 + \zeta_2],$$

where

$$\zeta_1 = \mathrm{Var}[h_1(X_1)] \quad\text{and}\quad \zeta_2 = 1 - \tau^2,$$

with

$$h_1(x) = E[h(X_1, X_2)\,|\,X_1 = x] .$$

Since $E[h_1(X_1)] = \tau$, and taking $\hat\tau$ to be the estimator of $\tau$, the sample estimator of $\zeta_1$ may be written as

$$\hat\zeta_1 = \frac{1}{n}\sum_{i=1}^{n}[\hat h_1(X_i) - \hat\tau]^2, \tag{4.3.6}$$

where

$$\hat h_1(X_i) = \frac{1}{n-1}\sum_{j \ne i}\mathrm{Sgn}\{(Y_i - Y_j)(Z_i - Z_j)\} . \tag{4.3.7}$$

Taking $\hat\zeta_2 = 1 - \hat\tau^2$ to be the estimator of $\zeta_2$, the proposed estimator for the variance of $\hat\tau$ is

$$\widehat{\mathrm{Var}}[\hat\tau] = \binom{n}{2}^{-1}[2(n-2)\hat\zeta_1 + \hat\zeta_2] .$$

Using the notation of Fligner and Rust, we see from (4.3.2) and (4.3.3) that

$$\hat h_1(X_i) = \frac{C_i}{n-1},$$

so that

$$\hat\zeta_1 = \frac{1}{n}\sum_{i=1}^{n}[C_i/(n-1) - \bar C/(n-1)]^2 = \frac{1}{n(n-1)^2}\sum_{i=1}^{n}[C_i - \bar C]^2,$$

which is the same as the Fligner-Rust estimator of $\zeta_1$ used in expression (4.3.4) with $(n-1)$ and $(n-2)$ being replaced by $n$ and $(n-1)$, respectively. It follows that

$$\widehat{\mathrm{Var}}[\hat\tau] = \binom{n}{2}^{-1}\left[\frac{2(n-2)}{n(n-1)^2}\sum_{i=1}^{n}(C_i - \bar C)^2 + 1 - \hat\tau^2\right]$$

and the corresponding test statistic is

$$K_{RS}^* = \hat\tau/[\widehat{\mathrm{Var}}(\hat\tau)]^{1/2} . \tag{4.3.8}$$

As with $K^*$, $K_{RS}^*$ is distribution-free under the hypothesis of independence, and is asymptotically distribution-free under the more general hypothesis $\tau = 0$.
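The behaviour of the two variance estimators on the rank configuration discussed above can be traced with a short computation. This is a sketch with illustrative function names: it computes the $C_i$ of (4.3.2), the jackknife estimator (4.3.4), and the proposed estimator leading to (4.3.8).

```python
import math


def sgn(t):
    return (t > 0) - (t < 0)


def c_values(y, z):
    """C_i = sum_j Sgn{(Y_i - Y_j)(Z_i - Z_j)}, as in (4.3.2); the j = i term is zero."""
    n = len(y)
    return [sum(sgn((y[i] - y[j]) * (z[i] - z[j])) for j in range(n)) for i in range(n)]


def tau_hat(c):
    """Kendall's tau from the C_i, as in (4.3.3)."""
    n = len(c)
    return sum(c) / (n * (n - 1))


def jackknife_variance(c):
    """Fligner-Rust jackknife estimator of 4*zeta_1, as in (4.3.4)."""
    n = len(c)
    c_bar = sum(c) / n
    return 4.0 * sum((ci - c_bar) ** 2 for ci in c) / ((n - 1) * (n - 2) ** 2)


def proposed_variance(c):
    """Proposed estimator of Var[tau_hat], built from zeta_1-hat and zeta_2-hat."""
    n = len(c)
    c_bar = sum(c) / n
    zeta1_hat = sum((ci - c_bar) ** 2 for ci in c) / (n * (n - 1) ** 2)
    zeta2_hat = 1.0 - tau_hat(c) ** 2
    return 2.0 / (n * (n - 1)) * (2 * (n - 2) * zeta1_hat + zeta2_hat)


def k_rs(c):
    """Test statistic (4.3.8); undefined only when tau_hat is +1 or -1."""
    return tau_hat(c) / math.sqrt(proposed_variance(c))


rank_y = [1, 2, 3, 4, 5, 6, 7, 8]
rank_z = [5, 6, 7, 8, 1, 2, 3, 4]
c = c_values(rank_y, rank_z)
print(c, tau_hat(c), jackknife_variance(c), proposed_variance(c), k_rs(c))
```

For this configuration every $C_i$ equals $-1$, so the jackknife estimate vanishes while the proposed estimate stays positive and the statistic remains finite.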

The null distribution of $K_{RS}^*$ was generated by a simulation study based on 10,000 replications. For each replication, two independent random samples each of size $n$ were generated from the standard normal distribution using the IMSL library. At each stage, $K_{RS}^*$ was calculated and a count was kept for each possible value up to three decimal places. The upper tail critical values (rounded to 2 decimal places) of $K_{RS}^*$ for selected $\alpha$-levels and for $n = 6(1)30$ are given in Table 4.1.
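A simulation of this kind can be sketched as follows. This is not the original IMSL program: it uses Python's standard random number generator, far fewer replications, an arbitrary seed, and a simple empirical quantile rule, all of which are assumptions made for illustration. Samples with $\hat\tau = \pm 1$, where the variance estimate is zero, are treated here as infinite values of the statistic.

```python
import math
import random


def sgn(t):
    return (t > 0) - (t < 0)


def k_rs(y, z):
    """K*_RS = tau_hat / sqrt(Var_hat[tau_hat]), following (4.3.8)."""
    n = len(y)
    c = [sum(sgn((y[i] - y[j]) * (z[i] - z[j])) for j in range(n)) for i in range(n)]
    c_bar = sum(c) / n
    tau_hat = c_bar / (n - 1)
    zeta1_hat = sum((ci - c_bar) ** 2 for ci in c) / (n * (n - 1) ** 2)
    zeta2_hat = 1.0 - tau_hat ** 2
    var_hat = 2.0 / (n * (n - 1)) * (2 * (n - 2) * zeta1_hat + zeta2_hat)
    if var_hat <= 0.0:
        # Only possible when tau_hat is +1 or -1 (an assumption about how to score it).
        return math.copysign(math.inf, tau_hat)
    return tau_hat / math.sqrt(var_hat)


def simulate_critical_values(n, alphas, reps, seed):
    """Empirical upper-tail critical values of K*_RS under independence."""
    rng = random.Random(seed)
    stats = sorted(
        k_rs([rng.gauss(0, 1) for _ in range(n)], [rng.gauss(0, 1) for _ in range(n)])
        for _ in range(reps)
    )
    # stats[k] with k = (1 - alpha) * reps leaves alpha * reps values at or above it.
    return {a: stats[min(reps - 1, int(round((1 - a) * reps)))] for a in alphas}


critical_values = simulate_critical_values(n=10, alphas=(0.10, 0.05, 0.01), reps=2000, seed=1)
for alpha, value in critical_values.items():
    print(alpha, round(value, 2))
```

Because the statistic depends on the data only through ranks, any continuous distribution could replace the normal in this sketch; a larger number of replications would be needed to approach the precision of a 10,000-replication study.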
Table 4.1

The Null Distribution of $K_{RS}^*$

Selected upper tail critical values of the distribution of $K_{RS}^*$ under the hypothesis of independence.

 n    α=0.10   α=0.05   α=0.025   α=0.01   α=0.005
 6    1.31     1.90     2.62      3.29     4.50
 7    1.38     1.87     2.41      3.36     4.14
 8    1.25     1.78     2.22      2.89     3.42
 9    1.27     1.76     2.28      3.00     3.46
10    1.27     1.74     2.17      2.83     3.40
11    1.28     1.72     2.23      2.76     3.20
12    1.27     1.68     2.11      3.75     3.25
13    1.24     1.68     2.10      2.72     3.20
14    1.23     1.66     2.07      2.63     3.10
15    1.25     1.70     2.13      2.61     2.98
16    1.24     1.67     2.07      2.62     2.98
17    1.24     1.63     2.07      2.58     2.97
18    1.26     1.69     2.03      2.58     2.98
19    1.26     1.69     2.06      2.52     2.98
20    1.20     1.63     1.97      2.41     2.79
21    1.26     1.70     2.12      2.60     2.92
22    1.25     1.62     2.05      2.43     2.77
23    1.26     1.66     2.04      2.58     2.94
24    1.25     1.67     2.07      2.55     2.84
25    1.22     1.65     2.02      2.55     2.85
26    1.24     1.63     2.01      2.43     2.77
27    1.25     1.66     1.98      2.39     2.71
28    1.25     1.63     2.00      2.47     2.31
29    1.27     1.68     2.04      2.54     2.84
30    1.23     1.62     2.00      2.44     2.77
4.4  Empirical Power Comparisons
The performances of three statistics based on Kendall's $\hat\tau$ and four statistics based on Pearson's $R$ were investigated through a Monte Carlo simulation study, each with 1000 replications. The statistics considered were the following:

1)  $K = \binom{n}{2}\hat\tau$ was compared to table A.21 of Hollander and Wolfe (1973).

2)  $ZK = K/[n(n-1)(2n+5)/18]^{1/2}$, which is Kendall's statistic standardized by the variance under the hypothesis of independence, was compared to the 0.05 upper critical value of the standard normal distribution, $z_{0.05} = 1.645$. For $n = 8$, both a correction for continuity (adjusted by 1 rather than by 0.5 since $K$ takes on only even values) and randomization for an exact $\alpha = 0.05$ level through the use of a Uniform [0,1] random variable were employed.

3)  Our proposed statistic, $K_{RS}^*$, was compared to the simulated critical values given in table 4.1.

4)  $T = R\{(n-2)/(1-R^2)\}^{1/2}$ was compared to the upper 0.05 cut-off value of the Student's t-distribution with $(n-2)$ degrees of freedom.

5)  The standardized Fisher's Z,

$$FZ = Z/[1/(n-3)]^{1/2}, \quad\text{where}\quad Z = \tfrac{1}{2}\ln[(1+R)/(1-R)],$$

was compared to $z_{0.05} = 1.645$.

6)  and  7)  RJ  

[Varj(R)]^^2 

and 


[Varj{Z)]^^2 

where  Varj(R)  and  Varj(Z)  are  the  jackknife  estimators  of  the 
variances  of  R and  Z,  respectively,  were  compared  to  Zq  = 1.645. 
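
As a minimal sketch of how the statistics in items 1), 2), 4) and 5)
can be evaluated for one sample of pairs, the following Python
functions compute K, ZK, T and FZ. Python and all function names are
our own illustrative choices and are not part of the original
computations, which were carried out in FORTRAN; the data are assumed
to be free of ties, as is the case for continuous variables.

```python
import math


def kendall_k(y, z):
    """K = sum over pairs i < j of sgn((y_i - y_j) * (z_i - z_j))."""
    n = len(y)
    k = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (y[i] - y[j]) * (z[i] - z[j])
            k += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return k


def zk_statistic(y, z):
    """K divided by its null standard deviation [n(n-1)(2n+5)/18]^(1/2)."""
    n = len(y)
    return kendall_k(y, z) / math.sqrt(n * (n - 1) * (2 * n + 5) / 18.0)


def pearson_r(y, z):
    """Product moment correlation coefficient R."""
    n = len(y)
    my, mz = sum(y) / n, sum(z) / n
    syz = sum((a - my) * (b - mz) for a, b in zip(y, z))
    syy = sum((a - my) ** 2 for a in y)
    szz = sum((b - mz) ** 2 for b in z)
    return syz / math.sqrt(syy * szz)


def t_statistic(r, n):
    """T = R * {(n-2)/(1-R^2)}^(1/2), referred to Student's t with n-2 d.f."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def fz_statistic(r, n):
    """FZ = Z / [1/(n-3)]^(1/2), with Z the Fisher transform of R."""
    z = 0.5 * math.log((1.0 + r) / (1.0 - r))
    return z * math.sqrt(n - 3)
```

Each of ZK and FZ would then be compared with 1.645, and T with the
upper 0.05 point of the t-distribution with n-2 degrees of freedom.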

The  use  of  these  jackknife  estimators  was  motivated  by  our  suspicion 
that  they  may  improve  the  performances  of  the  tests  based  on  R or  Z 
when  the  assumptions  of  normality  and/or  independence  were  no  longer 
present.  Fisher's  Z transform  was  included  in  this  study  not  only  for 
completeness  but  also  because  of  some  of  its  desirable  properties  such 
as  its  stabilized  variance,  and  the  fact  that  it  is  "more  nearly 
normal"  than  R.  Furthermore,  most  advocates  of  the  jackknife 
recommend  variance  stabilizing  transformations  to  "keep  the  jackknife 
on  scale  and  thus  prevent  distortion  of  the  results"  (see,  for 
example,  Hinkley,  1977,  1978,  and  Miller,  1974).  The  jackknife 
estimators  of  R and  Z were  obtained  by  a procedure  similar  to  that 
given  in  Hinkley  (1978).  First,  we  calculate  the  pseudovalues 

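$$PR^{(i)} = nR - (n-1)R^{(i)}$$

and

$$PZ^{(i)} = nZ - (n-1)Z^{(i)}, \quad i = 1, 2, \ldots, n,$$

where $R^{(i)}$ is the product moment correlation coefficient based on a
sample of size (n-1) obtained by deleting the $i$th pair, and $Z^{(i)}$ is
the corresponding Fisher transform; i.e.,

$$Z^{(i)} = \tfrac{1}{2} \ln\left[\frac{1+R^{(i)}}{1-R^{(i)}}\right].$$

The sample variances of the pseudovalues are then given by

$$V_R = \frac{1}{n-1} \sum_{i=1}^{n} \left[PR^{(i)} - \overline{PR}\right]^2$$

and

$$V_Z = \frac{1}{n-1} \sum_{i=1}^{n} \left[PZ^{(i)} - \overline{PZ}\right]^2,$$

where

$$\overline{PR} = \frac{1}{n} \sum_{i=1}^{n} PR^{(i)} \quad \text{and} \quad \overline{PZ} = \frac{1}{n} \sum_{i=1}^{n} PZ^{(i)}.$$

The recommended variance estimates of R and Z are then

$$\mathrm{Var}_J(R) = \frac{V_R}{n} \quad \text{and} \quad \mathrm{Var}_J(Z) = \frac{V_Z}{n}.$$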
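
The jackknife steps above can be sketched as follows. This is a direct
(non-recursive) Python illustration under our own naming, assuming the
usual jackknife convention that the variance estimate is the sample
variance of the pseudovalues divided by n; it recomputes each
leave-one-out correlation from scratch and assumes that no
leave-one-out correlation equals +1 or -1, so that the Fisher
transform is defined.

```python
import math


def pearson_r(y, z):
    """Product moment correlation coefficient."""
    n = len(y)
    my, mz = sum(y) / n, sum(z) / n
    syz = sum((a - my) * (b - mz) for a, b in zip(y, z))
    syy = sum((a - my) ** 2 for a in y)
    szz = sum((b - mz) ** 2 for b in z)
    return syz / math.sqrt(syy * szz)


def fisher_z(r):
    """Fisher transform Z = (1/2) ln[(1+R)/(1-R)]."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


def jackknife_variances(y, z):
    """Return (Var_J(R), Var_J(Z)) from the pseudovalues of R and Z."""
    y, z = list(y), list(z)
    n = len(y)
    r = pearson_r(y, z)
    zf = fisher_z(r)
    pr, pz = [], []
    for i in range(n):
        # Correlation with the i-th pair deleted.
        r_i = pearson_r(y[:i] + y[i + 1:], z[:i] + z[i + 1:])
        pr.append(n * r - (n - 1) * r_i)
        pz.append(n * zf - (n - 1) * fisher_z(r_i))

    def sample_variance(values):
        mean = sum(values) / n
        return sum((v - mean) ** 2 for v in values) / (n - 1)

    return sample_variance(pr) / n, sample_variance(pz) / n
```

The statistics RJ and ZJ are then obtained by dividing R and Z by the
square roots of the two returned values.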

In  the  computer  algorithm  to  calculate  these  jackknife  estimators, 
some  useful  recursive  relations  were  used  which  enable  one  to  update 
sample  variances  and  covariances  when  the  sample  is  augmented  by  an 
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additional  observation.  These  relations  are  derived  to  be 


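$$\sum_{i=1}^{n} (X_i-\bar X_n)^2 = \sum_{i=1}^{n-1} (X_i-\bar X_{n-1})^2 + \frac{n-1}{n} (X_n-\bar X_{n-1})^2$$

and

$$\sum_{i=1}^{n} (X_i-\bar X_n)(Y_i-\bar Y_n) = \sum_{i=1}^{n-1} (X_i-\bar X_{n-1})(Y_i-\bar Y_{n-1}) + \frac{n-1}{n} (X_n-\bar X_{n-1})(Y_n-\bar Y_{n-1}),$$

where

$$\bar X_m = \frac{1}{m} \sum_{i=1}^{m} X_i \quad \text{and} \quad \bar Y_m = \frac{1}{m} \sum_{i=1}^{m} Y_i.$$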
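
A small illustration of updating sums of squares and cross products
when one observation is added is given below. It uses the standard
one-pass updating identity, which we assume is the kind of recursion
referred to above; the function name and argument layout are
hypothetical and do not reproduce the original FORTRAN routine.

```python
def update_sums(n_prev, mean_x, mean_y, sxx, sxy, x_new, y_new):
    """Update means, sum of squares and cross products for one new pair.

    sxx is the sum of (X_i - mean_x)^2 and sxy the sum of
    (X_i - mean_x)(Y_i - mean_y) over the first n_prev pairs.
    """
    n = n_prev + 1
    dx = x_new - mean_x          # deviation from the OLD mean of X
    dy = y_new - mean_y          # deviation from the OLD mean of Y
    weight = n_prev / n          # the factor (n-1)/n in the identity
    sxx += weight * dx * dx
    sxy += weight * dx * dy
    mean_x += dx / n
    mean_y += dy / n
    return n, mean_x, mean_y, sxx, sxy
```

Starting from `n_prev = 0` with all sums equal to zero and feeding the
pairs one at a time reproduces the directly computed sums.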


The results of the Monte Carlo study based on 1000 samples, each
comprised of n=8 or n=20 pairs of observations, are given in tables
4.2-4.5. The empirical sizes (powers) corresponding to the seven
tests listed above were computed for several bivariate
distributions. For the hypothesis of independence, the pairs (Y,Z)
were formed by letting

$$Y = W_1 + \Delta W_3$$

and

$$Z = W_2 + \Delta W_3, \tag{4.4.1}$$

where $W_1$, $W_2$ and $W_3$ are independent random variables, so that the
hypothesis of independence is equivalent to testing $\Delta=0$. For each of
the 1000 iterations, 3n i.i.d. random variates were generated from a
specific distribution using IMSL subroutines. These were divided into
three groups, each of size n (n=8 and n=20), to obtain $\{W_{1i}\}$, $\{W_{2i}\}$
and $\{W_{3i}\}$, i = 1, 2, ..., n. The pairs $(Y_i,Z_i)$, i = 1, 2, ...,
n, were then obtained by relations (4.4.1), for $\Delta$ = 0.0, 1.0 and 2.0
(i.e., when Corr(Y,Z) = 0.0, 0.5 and 0.8, respectively, since
Corr(Y,Z) = $\Delta^2/(1+\Delta^2)$ when the W's have a common finite
variance). The seven statistics mentioned earlier were calculated from
these pairs, and were compared to their corresponding cut-off values to
obtain the empirical powers. The results for the standard normal, the
Uniform [0,1], and the Cauchy distributions are given in table 4.2.
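
The construction in (4.4.1) can be sketched as below. This is an
illustration only: Python's `random` module stands in for the IMSL
generators used in the study, and the function names and the `draw`
callback are hypothetical.

```python
import math
import random


def draw_normal(rng):
    return rng.gauss(0.0, 1.0)


def draw_uniform(rng):
    return rng.random()


def draw_cauchy(rng):
    # Standard Cauchy variate by inversion of the distribution function.
    return math.tan(math.pi * (rng.random() - 0.5))


def trivariate_reduction_pairs(n, delta, draw, rng):
    """Generate n pairs (Y, Z) with Y = W1 + delta*W3 and Z = W2 + delta*W3."""
    w1 = [draw(rng) for _ in range(n)]
    w2 = [draw(rng) for _ in range(n)]
    w3 = [draw(rng) for _ in range(n)]
    y = [a + delta * c for a, c in zip(w1, w3)]
    z = [b + delta * c for b, c in zip(w2, w3)]
    return y, z


def implied_correlation(delta):
    """Corr(Y, Z) when W1, W2, W3 share a common finite variance."""
    return delta ** 2 / (1.0 + delta ** 2)
```

For example, `trivariate_reduction_pairs(8, 1.0, draw_normal,
random.Random(1))` returns one sample of size 8 under the alternative
with correlation 0.5. For the Cauchy case the correlation does not
exist, and the parameter only indexes the strength of dependence.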

To test the hypothesis $H_0$: $\tau=0$, the seven statistics under
investigation were calculated on $(Y_1,Z_1), \ldots, (Y_n,Z_n)$, n=8 and
n=20, but here the (Y,Z) pairs were generated from such bivariate
distributions as the bivariate Cauchy, the Pearson Type II and the
Pearson Type VII distributions (see Johnson and Ramberg, 1977). In
the case of such elliptically symmetric distributions, $\lambda=0$ is
equivalent to $\rho$=Corr(Y,Z)=0, which in turn is equivalent to $\tau=0$. To
generate these bivariate observations, we have adopted the procedures
given by Johnson and Ramberg (1977). To form a (Y,Z) pair, we first
use IMSL subroutines to obtain two independent U[0,1] variates,
$U_1$ and $U_2$. For each of the three bivariate distributions
mentioned above, $U_1$ and $U_2$ are transformed into two uncorrelated
variables $X_1$ and $X_2$, by appropriate transformations discussed below.
The pair (Y,Z) is then obtained by

$$Y = X_1$$

and

$$Z = \lambda X_1 + (1-\lambda^2)^{1/2} X_2, \tag{4.4.2}$$

where $0 \le \lambda \le 1$, and Corr(Y,Z) = $\lambda$ if $X_1$ and $X_2$ have finite equal
variances. For the bivariate Cauchy distribution, which is a
heavy-tailed distribution with no moments, the transforms $X_1$ and $X_2$
are obtained as follows:

$$X_1 = (U_1^{-2} - 1)^{1/2} \cos(2\pi U_2)$$

and

$$X_2 = (U_1^{-2} - 1)^{1/2} \sin(2\pi U_2).$$

The Pearson Type II is a light-tailed distribution which converges to
the bivariate normal distribution as the shape parameter $\nu$ increases
to infinity. Here $X_1$ and $X_2$ are obtained by

$$X_1 = (1 - U_1^{1/(\nu+1)})^{1/2} \cos(2\pi U_2)$$

and

$$X_2 = (1 - U_1^{1/(\nu+1)})^{1/2} \sin(2\pi U_2).$$

The Pearson Type VII distribution is more heavy-tailed than the
bivariate normal distribution, with the tail weight increasing as the
parameter $\nu$ decreases. $X_1$ and $X_2$ are given by

$$X_1 = (U_1^{1/(1-\nu)} - 1)^{1/2} \cos(2\pi U_2)$$

and
Table 4.2

Relative Frequency of Rejecting $H_0$
(nominal α=0.05)

Distribution              Tests Based on $\hat\tau$      Tests Based on R
of W1,W2,W3      Δ       K      ZK    K_RS       T      FZ     RJ     ZJ

n=8

Normal          0.0    0.048  0.048  0.055    0.042  0.040  0.087  0.060
                1.0    0.311  0.311  0.314    0.376  0.370  0.487  0.330
                2.0    0.731  0.731  0.737    0.853  0.850  0.882  0.752

Uniform         0.0    0.048  0.048  0.055    0.051  0.049  0.090  0.057
                1.0    0.288  0.288  0.306    0.347  0.340  0.490  0.343
                2.0    0.754  0.754  0.772    0.894  0.891  0.933  0.838

Cauchy          0.0    0.046  0.046  0.061    0.058  0.057  0.076  0.048
                1.0    0.340  0.340  0.331    0.409  0.407  0.384  0.204
                2.0    0.529  0.529  0.509    0.572  0.568  0.548  0.365

n=20

Normal          0.0    0.060  0.059  0.059    0.056  0.056  0.077  0.065
                1.0    0.688  0.684  0.676    0.766  0.762  0.777  0.724
                2.0    0.994  0.994  0.992    0.998  0.998  0.997  0.994

Uniform         0.0    0.060  0.059  0.059    0.066  0.065  0.072  0.058
                1.0    0.703  0.699  0.721    0.779  0.776  0.857  0.817
                2.0    1.0    1.0    1.0      1.0    1.0    1.0    1.0

Cauchy          0.0    0.047  0.046  0.047    0.057  0.056  0.028  0.024
                1.0    0.622  0.619  0.568    0.512  0.512  0.344  0.227
                2.0    0.902  0.901  0.850    0.717  0.717  0.541  0.356
Table 4.3

Relative Frequency of Rejecting $H_0$
(nominal α=0.05)

Distn.                      Tests Based on $\hat\tau$      Tests Based on R
of (Y,Z)      ν     λ      K      ZK    K_RS       T      FZ     RJ     ZJ

n=8

Pearson II   1.0   0.0   0.025  0.025  0.033    0.030  0.028  0.094  0.053
                   0.5   0.283  0.285  0.330    0.334  0.325  0.530  0.383
                   0.8   0.770  0.772  0.770    0.877  0.871  0.929  0.833

             5.0   0.0   0.039  0.039  0.051    0.052  0.049  0.107  0.057
                   0.5   0.303  0.305  0.315    0.355  0.350  0.497  0.369
                   0.8   0.770  0.772  0.759    0.860  0.857  0.904  0.775

n=20

Pearson II   1.0   0.0   0.023  0.023  0.036    0.025  0.025  0.059  0.05
                   0.5   0.743  0.740  0.791    0.810  0.807  0.832  0.865
                   0.8   1.0    1.0    1.0      1.0    1.0    1.0    1.0

             5.0   0.0   0.042  0.041  0.049    0.035  0.034  0.061  0.053
                   0.5   0.711  0.708  0.708    0.778  0.777  0.830  0.782
                   0.8   0.999  0.999  0.998    1.0    1.0    1.0    1.0
Table 4.4

Relative Frequency of Rejecting $H_0$
(nominal α=0.05)

Distn.                       Tests Based on $\hat\tau$      Tests Based on R
of (Y,Z)       ν     λ      K      ZK    K_RS       T      FZ     RJ     ZJ

n=8

Pearson VII   2.0   0.0   0.076  0.077  0.075    0.143  0.139  0.130  0.075
                    0.5   0.354  0.356  0.331    0.457  0.453  0.470  0.317
                    0.8   0.732  0.734  0.707    0.793  0.789  0.802  0.630

              1.25  0.0   0.106  0.107  0.091    0.358  0.356  0.177  0.083
                    0.5   0.383  0.385  0.338    0.580  0.574  0.426  0.198
                    0.8   0.702  0.704  0.641    0.783  0.782  0.690  0.380

n=20

Pearson VII   2.0   0.0   0.078  0.077  0.059    0.214  0.214  0.104  0.073
                    0.5   0.670  0.669  0.593    0.672  0.671  0.583  0.473
                    0.8   0.985  0.985  0.968    0.945  0.944  0.905  0.813

              1.25  0.0   0.101  0.100  0.070    0.434  0.433  0.166  0.061
                    0.5   0.643  0.642  0.511    0.659  0.659  0.444  0.198
                    0.8   0.962  0.962  0.915    0.836  0.835  0.703  0.356
Table 4.5

Relative Frequency of Rejecting $H_0$
(nominal α=0.05)

Distn.                      Tests Based on $\hat\tau$      Tests Based on R
of (Y,Z)            λ      K      ZK    K_RS       T      FZ     RJ     ZJ

n=8

Bivariate          0.0   0.089  0.090  0.084    0.245  0.243  0.151  0.091
Cauchy             0.5   0.368  0.370  0.323    0.516  0.515  0.443  0.263
                   0.8   0.727  0.729  0.683    0.782  0.776  0.744  0.522

n=20

Bivariate          0.0   0.091  0.090  0.065    0.350  0.350  0.131  0.075
Cauchy             0.5   0.653  0.649  0.551    0.660  0.660  0.481  0.326
                   0.8   0.977  0.977  0.947    0.878  0.878  0.793  0.599
$$X_2 = (U_1^{1/(1-\nu)} - 1)^{1/2} \sin(2\pi U_2).$$

Note that for $\nu$ = 1.5, the Pearson VII is equivalent to the bivariate
Cauchy distribution. The results of the Monte Carlo study for
$\lambda$ = 0.0, 0.5 and 0.8 and for selected values of $\nu$ are given in tables
4.3-4.5.
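
The generation scheme of this section can be sketched in Python as
follows. The Pearson VII and Cauchy radial transforms follow the
formulas above (and the FORTRAN listing in the Appendix); the Pearson
II exponent 1/(ν+1) is our assumption about the Johnson and Ramberg
(1977) parameterization, and all function names are hypothetical.

```python
import math
import random


def radius(u1, family, nu=None):
    """Radial part of (X1, X2) for a uniform variate u1 in (0, 1]."""
    if family == "cauchy":
        return math.sqrt(u1 ** -2.0 - 1.0)
    if family == "pearson7":
        return math.sqrt(u1 ** (1.0 / (1.0 - nu)) - 1.0)
    if family == "pearson2":
        # Assumed exponent 1/(nu + 1); see the lead-in above.
        return math.sqrt(1.0 - u1 ** (1.0 / (nu + 1.0)))
    raise ValueError("unknown family: " + family)


def elliptical_pair(lam, family, nu, rng):
    """One (Y, Z) pair with Y = X1 and Z = lam*X1 + sqrt(1 - lam^2)*X2."""
    u1 = 1.0 - rng.random()      # lies in (0, 1], avoids division by zero
    u2 = rng.random()
    r = radius(u1, family, nu)
    x1 = r * math.cos(2.0 * math.pi * u2)
    x2 = r * math.sin(2.0 * math.pi * u2)
    return x1, lam * x1 + math.sqrt(1.0 - lam * lam) * x2
```

A sample of size n is obtained by calling `elliptical_pair` n times
with a shared `random.Random` instance, e.g. with `lam=0.5`,
`family="pearson7"` and `nu=2.0`.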

4.5  Conclusions  and  Recommendations 
In many cases, it is difficult to compare the powers of these
tests, especially when the corresponding α-levels are highly different
for the different tests. In this discussion we present what we
believe to be a reasonable set of conclusions drawn from our study.

One such conclusion is that for the hypothesis of independence, the
tests based on R, namely T and FZ, are highly robust in the sense of
having stable sizes and powers, as may be seen in table 4.2. This was
to be expected for light-tailed distributions, as was indicated by the
ARE calculations given in section 1.2. However, for n=20 the tests
based on Kendall's tau have slightly higher powers for a heavy-tailed
distribution such as the Cauchy, although for n=8 the performance of
the tests T and FZ is comparable to, if not better than, that of the
tests based on $\hat\tau$. The tests based on R also do well for the
hypothesis $H_0$: $\tau=0$ when the observations come from a light-tailed
bivariate distribution such as the Pearson II (see table 4.3). For
n=8, both T and FZ perform remarkably well in terms of holding their
α-levels and powers, while for n=20, the tests using the jackknife
variance estimators, i.e., RJ and ZJ, do considerably better, followed
by the test $K_{RS}$. For the hypothesis $H_0$: $\tau=0$, and for heavier-tailed
bivariate distributions such as the Pearson VII or the bivariate
Cauchy, the tests based on Kendall's tau do extremely well. Except for
the test ZJ, which is Fisher's Z transform standardized by the
jackknife estimator of standard error, all tests based on R have
highly inflated α-levels, and hence should not be included in any
power comparisons. Of the remaining tests, those based on $\hat\tau$ exhibit
the highest empirical powers. In particular, our test $K_{RS}$ performs
the best both in terms of empirical α-level and power.

In summary, we note that for the hypothesis of independence the
tests based on Pearson's R are, in most cases, remarkably robust in
terms of both size and power. For the hypothesis $\tau=0$, the tests based
on $K_{RS}$ are consistently better except in the case of the Pearson II
distribution where ZJ has slightly higher powers. However, in
practice one must take into consideration the ease with which a
particular statistic is calculated. As can be seen from the previous
section, the computation of a statistic such as ZJ is very tedious
compared to that of $K_{RS}$, which is a function of the $C_i$'s which are
naturally calculated in a Kendall's tau problem. Based on the above
discussions, our final recommendations are

1) For the hypothesis of independence, we recommend using a simple
test based on R such as T or FZ, except for large n (≥20) and
heavy-tailed distributions where we recommend the use of a test
based on the ordinary Kendall's tau such as K or ZK.

2) For the hypothesis $H_0$: $\tau=0$, we recommend a test based on our
statistic $K_{RS}$ in all situations. Furthermore, it is important to
note that $K_{RS}$ may also be used to construct confidence intervals
for $\tau$. For small samples we recommend the use of table 4.1, while
for large samples (n>30) one may use the appropriate percentiles
of the standard normal distribution.


CHAPTER  FIVE 

MONTE CARLO RESULTS AND CONCLUSIONS
5.1 Introduction

In Chapters 2 and 3, we discussed two tests for partial
correlation based respectively on $T_n$, Kendall's tau statistic
calculated on the residuals, and $R_n$, Pearson's partial correlation
coefficient. Based on the values of ARE($T_n$,$R_n$) calculated in Chapter
3, we concluded that, for large samples and under the null hypothesis
of the independence of E and E' (the "error variables" in the linear
models relating Y to X, and Z to X, respectively), $T_n$ performs better
than $R_n$ for heavy-tailed distributions. In Chapter 4, we studied the
usual correlation problem and discussed several statistics for testing
the null hypothesis $\tau=0$, where $\tau$ was Kendall's correlation
coefficient between the variables Y and Z. In this chapter, a Monte
Carlo study is used to investigate the performances of the tests based
on $T_n$, $R_n$, and statistics similar to those discussed in Chapter 4 but
here calculated on the residuals from the fit involving the covariate
X.

In section 5.2, we shall discuss statistics similar to the ones
studied in Chapter 4 but modified to fit the partial correlation
setting, and tabulate their simulated null distributions. Section 5.3
contains a description of our Monte Carlo study and the tables of
results. Section 5.4 contains our overall conclusions and
recommendations. In section 5.5, we give a brief list of related
topics open for future research and investigation.

5.2  More  Tests  for  Partial  Correlation 
In this section, we shall develop some statistics for testing a
broader null hypothesis than that of the independence between E and
E'. In particular, we shall be interested in testing

$$H_0: \tau = 0 \quad \text{versus} \quad H_a: \tau > 0, \tag{5.2.1}$$

where

$$\tau = P\{(E_1-E_2)(E_1'-E_2')>0\} - P\{(E_1-E_2)(E_1'-E_2')<0\}. \tag{5.2.2}$$

The two primary measures for partial correlation discussed in Chapters
2 and 3 are

$$T_n = \binom{n}{2}^{-1} \sum_{i<j} \mathrm{sgn}\{(U_i-U_j)(V_i-V_j)\} \tag{5.2.3}$$

and

$$R_n = \frac{\sum_{i=1}^{n} (U_i-\bar U)(V_i-\bar V)}{\left\{\sum_{i=1}^{n} (U_i-\bar U)^2 \sum_{i=1}^{n} (V_i-\bar V)^2\right\}^{1/2}}, \tag{5.2.4}$$

where $(U_1,V_1), \ldots, (U_n,V_n)$ are the residuals obtained from fitting
the linear models

$$Y_i = \alpha_1 + \beta_1 X_i + E_i$$

and

$$Z_i = \alpha_2 + \beta_2 X_i + E_i', \quad i=1,2,\ldots,n. \tag{5.2.5}$$
To test the hypotheses of (5.2.1), in addition to the statistics
$T_n$ and $R_n$, we also use statistics similar to those discussed in Chapter
4. One such statistic is $K^*$, which is the statistic given in
(4.3.8) but applied to the residuals. That is, if

$$C_i = \sum_{j=1}^{n} \mathrm{sgn}\{(U_i-U_j)(V_i-V_j)\},$$

the statistic $K^*$ may be written as

$$K^* = T_n \Big/ \left\{ \frac{2}{n(n-1)} \left[ \frac{2(n-2)}{n(n-1)^2} \sum_{i=1}^{n} (C_i-\bar C)^2 + 1 - T_n^2 \right] \right\}^{1/2}, \tag{5.2.6}$$

where $\bar C = \frac{1}{n} \sum_{i=1}^{n} C_i$, and $T_n$ is Kendall's tau applied to the residuals
as given in (5.2.3).
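
The computation of $K^*$ can be sketched in Python by following the
arithmetic of Program One in the Appendix. The subroutine TAUHAT called
there is not reproduced in this text, so we assume that it returns the
sum of the $C_i$ and the sum of squared deviations of the $C_i$ about
their mean; the function name `k_star` is our own.

```python
import math


def k_star(u, v):
    """Studentized Kendall's tau computed on residual pairs (u_i, v_i)."""
    n = len(u)
    c = []
    for i in range(n):
        total = 0
        for j in range(n):
            prod = (u[i] - u[j]) * (v[i] - v[j])
            total += (prod > 0) - (prod < 0)   # sgn; the j == i term is 0
        c.append(total)
    tau = sum(c) / (n * (n - 1.0))
    # As in the listing, back off slightly from perfect agreement so that
    # the variance estimate below stays positive.
    if tau == 1.0:
        tau = 0.999
    if tau == -1.0:
        tau = -0.999
    c_bar = sum(c) / n
    ssc = sum((ci - c_bar) ** 2 for ci in c)   # assumed meaning of SSC
    zeta1 = ssc / (n * (n - 1.0) ** 2)
    zeta2 = 1.0 - tau * tau
    est_var = (2.0 * (n - 2.0) * zeta1 + zeta2) / (n * (n - 1.0) / 2.0)
    return tau / math.sqrt(est_var)
```

The returned value would be compared with the appropriate entry of
Table 5.1 or Table 5.2, depending on how the residuals were obtained.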

The distribution of $K^*$ under the null hypothesis that $\tau=0$ was
generated by a Monte Carlo simulation study in two cases: (i) when
the residuals were obtained by the OLS fit and (ii) when they were
obtained by the LAV fit. In each of these two cases, the residuals
were obtained from the models

$$Y_i = X_i + E_i$$

and

$$Z_i = X_i + E_i', \quad i=1,2,\ldots,n,$$

where $X_i$, i = 1,2,...,n, are i.i.d. standard normal variables
generated by IMSL subroutines, and $(E_i,E_i')$, i = 1,2,...,n, are pairs
of observations from the Pearson Type VII distribution with $\lambda=0$ (i.e.,
$\tau=0$) and $\nu=2$, generated by the procedures described in section
4.4. From these residuals the statistic $K^*$ was calculated and its
value recorded. This process was repeated 10,000 times. The upper
tail critical values of $K^*$ for selected values of α and for n = 6(1)20
are given in Table 5.1 (the OLS fit) and Table 5.2 (the LAV fit).
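
Once the simulated values of a statistic have been recorded, upper
tail critical values can be read off as in the following sketch. The
Appendix program accumulates a histogram of the simulated values
instead; sorting, as done here, is our own simplification, and the
function name is hypothetical.

```python
def upper_tail_critical_values(simulated, alphas):
    """Empirical upper-tail cutoffs from a list of simulated statistics.

    For each alpha, the cutoff is the smallest simulated value c such that
    the proportion of simulated values strictly greater than c is at most
    alpha.
    """
    ordered = sorted(simulated)
    m = len(ordered)
    table = {}
    for alpha in alphas:
        # Number of simulated values allowed to lie above the cutoff.
        allowed = int(alpha * m + 1e-9)
        index = max(m - allowed - 1, 0)
        table[alpha] = ordered[index]
    return table
```

With 10,000 recorded values and `alphas = [0.10, 0.05, 0.025, 0.01,
0.005]` this yields one row of a table such as Table 5.1.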

It must be noted that the use of the Pearson Type VII
distribution with $\nu=2$ to generate the null ($\tau=0$) distribution of $K^*$
was not altogether arbitrary. This choice was motivated by the fact
that this particular distribution is "close" to the bivariate normal
distribution in terms of having moments and in terms of tail weight
(it has a slightly heavier tail than the bivariate normal
distribution), but it is more appropriate than the bivariate normal
distribution for testing the null hypothesis $\tau=0$ since under the
Pearson Type VII distribution, $\lambda=0$ ($\tau=0$ and $\rho=0$) does not necessarily
imply that E and E' are statistically independent.


Table 5.1

The Null Distribution of $K^*$ (OLS fit)

  n    α=0.10   α=0.05   α=0.025   α=0.01   α=0.005

  6     1.90     2.62      -         -        -
  7     1.87     2.60     3.36      4.72     6.48
  8     1.80     2.44     3.15      4.61     5.22
  9     1.68     2.35     3.05      4.11     5.00
 10     1.66     2.26     2.96      4.14     4.95
 11     1.64     2.24     2.39      4.05     4.96
 12     1.62     2.21     2.89      3.88     4.95
 13     1.59     2.18     2.83      3.74     4.65
 14     1.57     2.18     2.83      3.72     4.5
 15     1.56     2.12     2.77      3.52     4.41
 16     1.54     2.11     2.70      3.57     4.45
 17     1.52     2.08     2.71      3.56     4.30
 18     1.51     2.07     2.65      3.43     4.14
 19     1.50     2.01     2.63      3.38     3.98
 20     1.52     2.03     2.62      3.44     4.00

Table 5.2

The Null Distribution of $K^*$ (LAV fit)

  n    α=0.10   α=0.05   α=0.025   α=0.01   α=0.005

  6     1.67     2.90      -         -        -
  7     1.59     2.40     3.36      4.72     6.45
  8     1.43     2.05     2.80      3.94     5.22
  9     1.44     2.00     2.61      3.60     4.44
 10     1.39     1.96     2.52      3.30     4.18
 11     1.42     1.97     2.48      3.15     3.71
 12     1.41     1.92     2.38      3.07     3.73
 13     1.38     1.88     2.36      3.00     3.41
 14     1.37     1.85     2.28      3.01     3.43
 15     1.34     1.82     2.29      2.98     3.55
 16     1.36     1.81     2.30      2.89     3.26
 17     1.35     1.78     2.22      2.87     3.31
 18     1.33     1.75     2.21      2.72     3.18
 19     1.34     1.76     2.16      2.78     3.15
 20     1.34     1.76     2.21      2.76     3.17
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5.3  The  Monte  Carlo  Study 

In this section we compare the performances of the tests based on
$T_n$, $R_n$ and $K^*$ through a Monte Carlo simulation study with 1000
replications. The hypotheses of interest are

$$H_0: E \text{ and } E' \text{ are independent} \tag{5.3.1}$$

and

$$H_0: \tau = 0, \tag{5.3.2}$$

where $\tau$ is as defined in (5.2.2), and E and E' are the error variables
of the model structures (5.2.5). Throughout this study we have taken
the variable X to have the standard normal distribution, and have let
the pair (E,E') assume a variety of different bivariate distributions.

For the hypothesis of independence, the class of alternatives is
defined by the "trivariate reduction" model given by

$$E = W_1 + \Delta W_3$$

and

$$E' = W_2 + \Delta W_3,$$

where $W_1$, $W_2$ and $W_3$ are mutually independent continuous random
variables, and $\Delta$ is a constant. The hypothesis of independence
(5.3.1) is then equivalent to

$$\Delta = 0.$$

For the hypothesis (5.3.2), we have taken the variable (E,E') to
have an elliptically symmetric distribution with "association
parameter" $\lambda$, so that the hypothesis (5.3.2) is equivalent to

$$\lambda = 0,$$

where $\lambda$ is as given in (4.4.2).

The variables $X_1, X_2, \ldots, X_n$ were generated by the IMSL
subroutine GGNML, and the pairs $(E_1,E_1'), \ldots, (E_n,E_n')$ were obtained
by the exact same procedures used to generate the variables $(Y_1,Z_1),
\ldots, (Y_n,Z_n)$ in section 4.4. The variables of interest under the
partial correlation setting were then formed by calculating

$$Y_i = X_i + E_i$$

and

$$Z_i = X_i + E_i', \quad i=1,2,\ldots,n.$$

From the above linear models, pairs of residuals $(U_1,V_1), \ldots,
(U_n,V_n)$ were obtained from (i) the OLS fit, and (ii) the LAV fit, and
from each of the two sets of residual pairs the statistics $T_n$, $K^*$ and
$R_n$ were calculated. Based on these statistics, the performances of
the following seven tests were compared.
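
The two kinds of residuals can be obtained as in the sketch below. The
OLS fit uses the usual closed-form estimates. For the LAV fit the study
relied on a library routine; here we substitute a brute-force search,
which rests on the fact that for a simple linear model some line
minimizing the sum of absolute residuals passes through two data points
with distinct X values. All names are our own, and the search is only
practical for small n such as those considered here.

```python
def ols_fit(x, y):
    """Least squares intercept and slope for y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def lav_fit(x, y):
    """Least absolute value intercept and slope by exhaustive search."""
    n = len(x)
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            if x[i] == x[j]:
                continue                      # vertical line, skip
            b = (y[j] - y[i]) / (x[j] - x[i])
            a = y[i] - b * x[i]
            loss = sum(abs(yk - a - b * xk) for xk, yk in zip(x, y))
            if best is None or loss < best[0]:
                best = (loss, a, b)
    return best[1], best[2]


def residuals(x, y, fit):
    """Residuals y_i - a - b*x_i for the chosen fitting function."""
    a, b = fit(x, y)
    return [yk - a - b * xk for xk, yk in zip(x, y)]
```

For example, `residuals(x, y, lav_fit)` and `residuals(x, z, lav_fit)`
give the pairs $(U_i, V_i)$ for the LAV case.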

Tests based on $T_n$:

(i) $T_1 = \binom{n}{2} T_n$ was compared to table A.21 of Hollander and Wolfe
(1973).

(ii) $T_2 = \binom{n}{2} T_n$ was compared to the tables of the simulated
distributions of $T_n$ under the hypothesis of independence. It
was compared to values of table 2.1 for the OLS fit, and table
2.2 for the LAV fit.

(iii) $T_3 = T_n / \{2(2n+5)/[9n(n-1)]\}^{1/2}$,
which is standardized by the variance of the ordinary
Kendall's tau under independence, was compared to the upper
α=0.05 critical value of the standard normal distribution,
$z_{0.05} = 1.645$.

For each of the above three tests randomization was employed to obtain
an exact α=0.05 level.
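
Randomization for an exact level can be sketched as follows for the
exact null distribution of Kendall's K in the ordinary (no covariate)
setting. This illustrates the general device only and is not the
procedure of the original program: the enumeration over permutations
and all names are our own, and it is feasible only for small n.

```python
from fractions import Fraction
from itertools import permutations


def kendall_null_pmf(n):
    """Exact null distribution of K under independence (no ties)."""
    counts = {}
    for perm in permutations(range(n)):
        k = 0
        for i in range(n):
            for j in range(i + 1, n):
                k += 1 if perm[j] > perm[i] else -1
        counts[k] = counts.get(k, 0) + 1
    total = sum(counts.values())
    return {k: Fraction(c, total) for k, c in counts.items()}


def randomized_cutoff(pmf, alpha):
    """Cutoff c and probability gamma with P(K > c) + gamma*P(K = c) = alpha."""
    alpha = Fraction(str(alpha))
    tail = Fraction(0)
    for k in sorted(pmf, reverse=True):
        if tail + pmf[k] > alpha:
            return k, (alpha - tail) / pmf[k]
        tail += pmf[k]
    return min(pmf), Fraction(1)


def reject(k_obs, cutoff, gamma, u):
    """Reject if K exceeds the cutoff, or equals it and u < gamma,
    where u is an independent Uniform[0,1] variate."""
    return k_obs > cutoff or (k_obs == cutoff and u < gamma)
```

For n=5 and α=0.05 the cutoff is K=6 with randomization probability
1/9, so that the test rejects with probability exactly 0.05 under the
null hypothesis.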

Tests based on $K^*$:

The three tests $K_1$, $K_2$ and $K_3$, respectively, were obtained by
comparing $K^*$ to

(i) the α=0.05 cutoff values of the distribution of $K_{RS}$ ($K^*$ under
the ordinary correlation problem given in table 4.1),

(ii) the α=0.05 cutoff values of the simulated null distribution of
$K^*$ when (E,E') has the bivariate normal distribution. Only
selected cutoff values were generated for completion. For
reasons we discussed in the previous section, we recommend
using tables 5.1 and 5.2 which contain the null distribution of
$K^*$ when (E,E') has the Pearson VII distribution, and

(iii) the α=0.05 critical value of table 5.1 (for the OLS fit) and
table 5.2 (for the LAV fit).
Tests based on $R_n$:

(i) $R_1 = R_n \{(n-3)/(1-R_n^2)\}^{1/2}$ was compared to the upper 0.05
cutoff value of the Student's t-distribution with (n-3) degrees of
freedom.

(ii) $R_2$, the standardized Fisher transform of $R_n$, where

$$Z_n = \tfrac{1}{2} \ln\{(1+R_n)/(1-R_n)\}, \tag{5.3.3}$$

was compared to $z_{0.05} = 1.645$.

(iii) $R_3 = R_n/\{\mathrm{Var}_J(R_n)\}^{1/2}$ was compared to $z_{0.05} = 1.645$.

(iv) $R_4 = Z_n/\{\mathrm{Var}_J(Z_n)\}^{1/2}$ was compared to $z_{0.05} = 1.645$, where
$Z_n$ is as given in (5.3.3).

The jackknife variance estimators $\mathrm{Var}_J(R_n)$ and $\mathrm{Var}_J(Z_n)$ were obtained
by the procedures discussed in section 4.4 but applied here to the
residual pairs.

The relative frequencies of rejecting $H_0$ for sample sizes n=8 and
n=20, and for various distributions are given in tables 5.3-5.12.
Tables 5.3-5.6 contain the results for the hypothesis of conditional
independence where the class of alternatives is given by the
"trivariate reduction" model. Tables 5.7-5.12 contain the results
when (E,E') has an elliptically symmetric bivariate distribution.


Relative Frequency of Rejecting $H_0$ (OLS fit)
Trivariate Reduction Model
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5.4  Conclusions  and  Recommendations 

From tables 5.3-5.6 we can see that for the hypothesis of
conditional independence and under the "trivariate reduction" model
the performance of Pearson's partial correlation coefficient ($R_1$ in
particular) is remarkable. For both the OLS fit and the LAV fit
and for both small and large samples the test $R_1$ exhibits an
unexpectedly high degree of robustness in terms of both size and
power. This is perhaps due to the fact that the "trivariate
reduction" model induces a linear structure between E and E', which is
the type of structure which occurs in the normal theory models for
which Pearson's statistic is designed. For n=20 and for heavy-tailed
distributions such as the Cauchy the tests based on Kendall's tau have
slightly higher powers, but this is perhaps due to their inflated
α-levels (see tables 5.5 and 5.6).

For the null hypothesis that $\tau=0$, and for very light-tailed
distributions such as the Pearson II (see tables 5.7 and 5.8) the
performances of the tests based on $R_n$ are again superior to those of
the other tests. For n=20 under the OLS fit, and for both n=8 and
n=20 in the case of the LAV fit, the tests $R_1$ and $R_2$ are conservative
(have low α-levels) for very light-tailed distributions (the Pearson
II with $\nu=1.0$). In such cases the test $R_4$ performs the best
overall. However, due to the difficulty involved in calculating the
statistic $R_4$ and since in practice one is not usually certain how
light-tailed the underlying distribution is, a statistic such as $R_1$
seems to be a better choice.

For medium to heavy-tailed distributions and for testing the null
hypothesis that $\tau=0$, tables 5.9-5.12 indicate that tests based on $R_n$
have highly inflated α-levels, low powers or both. The best overall
performance in terms of both size and power is that of the test $K_3$,
which uses the null distribution of $K^*$ given in tables 5.1 and 5.2.
However, under very heavy-tailed distributions such as the Pearson VII
with $\nu=1.25$, and with the OLS fit (see table 5.9) the test $K_3$ has
highly inflated levels. Since in practice one may have no prior
knowledge of the degree of the tail weight of the underlying
distribution it is recommended that the LAV estimation be used in
testing $\tau=0$.

The summary of our recommendations for testing for partial
correlation is as follows.

1) For the hypothesis of conditional independence, and for the
hypothesis $\tau=0$ when (E,E') has a very light-tailed distribution,
we recommend the use of the usual test ($R_1$) based on Pearson's
partial correlation coefficient.

2) For the hypothesis $\tau=0$, and for medium to heavy-tailed
distributions we recommend the use of the statistic $K^*$ obtained
from the residuals of a LAV fit and compared to the cut-off values
given in table 5.2. For large sample sizes (n>20) we suggest
comparing $K^*$ to the appropriate critical values of the standard
normal distribution.
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5.5  Related  Topics  for  Future  Research 

This work is complete only in the sense of fulfilling our initial
objective of studying the partial correlation problem under the simple
linear setting. However, there are several related problems which
need particular attention in future investigations. For example, one
may study the partial correlation problem when each of Y and Z is
related to a p-variate vector X by the general linear model or by some
other non-linear or functional form. For the simple linear setting
one may investigate classes of dependence alternatives other than the
"trivariate reduction" model, although our experience shows that this
is by no means an easy task as far as theory is concerned.

Another problem of interest is to study the theoretical
properties of the statistics $K_{RS}$ and $K^*$ proposed for testing the null
hypothesis that $\tau=0$. For example, one may study the efficiencies of
such tests relative to the other tests discussed in this work or
investigate their empirical performances under bivariate distributions
other than the elliptically symmetric distributions considered here.


APPENDIX 
COMPUTER  PROGRAM 


P HOG HA a ONE 


C 

c 
c 

C A PE0G3AM  TO  FIND  THE  NOLL  DISTHIBUTICN  OF  THE 

C STATISTIC  K*  FfiOM  THE  L.A.V.  FIT  OF  THE  LINEAR 

C MODELS: 

C Y = X ♦ E 

C Z = X + F , 

C WHERE  X HAS  THE  N (0,  1)  DISTRIBUTION  AND  (E,F) 

C HAS  THE  BIVARIATE  PEARSON  VII  DISTEIBDTICN 

WITH  ASSOCIATION  PARAMETER  RBO=0.0  AND  SHAPE 
C PARAMETER  NU=2.0: 

C 

C 

C 

      INTEGER NR,NS
      REAL R(60)
      REAL X(20),Y(20),Z(20),CC(20)
      REAL E(20),S(20),U(20),V(20)
      REAL W1(20),W2(20),W3(20)
      INTEGER ID(4500)
      INTEGER IND(20),ITER
      REAL FF(20),WT(20),A,B,A1,B1,A2,B2
      DOUBLE PRECISION DSEED1,DSEED2
C
      M=10000
      XM=FLOAT(M)
C
      NR=60
      NS=20
      DSEED1=145645.D0
      DSEED2=123457.D0
C
      DO 9 J=1,4500
    9 ID(J)=0
C
      DO 777 K=1,M
C
      CALL GGNML(DSEED1,NS,S)
      CALL GGUBS(DSEED2,NR,R)
      N=20
      XN=FLOAT(N)
C
      NC2=XN*(XN-1.0)/2.0
      FNC2=FLOAT(NC2)
C
      RHO=0.0
      RHO1=SQRT(1.0-RHO**2)
      EX=-1.0
      DO 25 I=1,N
      X(I)=S(I)
      I1=2*I-1
      I2=I1+1
      W1(I)=(SQRT((R(I1)**EX)-1.0))
     *      *COS(2.0*3.1416*R(I2))
      W2(I)=(SQRT((R(I1)**EX)-1.0))
     *      *SIN(2.0*3.1416*R(I2))
      W3(I)=RHO*W1(I)+RHO1*W2(I)
      Y(I)=X(I)+W1(I)
      Z(I)=X(I)+W3(I)
   25 CONTINUE

C
      CALL DESL1(Y,X,N,A1,B1,ITER,FF,WT,IND,IFAULT)
      CALL DESL1(Z,X,N,A2,B2,ITER,FF,WT,IND,IFAULT)
C
      DO 17 I=1,N
      U(I)=Y(I)-A1-B1*X(I)
      V(I)=Z(I)-A2-B2*X(I)
   17 CONTINUE
C
      CALL TAUHAT(N,XN,U,V,SUMC,SSC)
      TAU=SUMC/(XN*(XN-1.0))
C
      IF (TAU.EQ.1.0) TAU=0.999
      IF (TAU.EQ.-1.0) TAU=-0.999
C
      ZETA1=SSC/(XN*(XN-1.0)*(XN-1.0))
      ZETA2=1.0-TAU*TAU
      ESTVAR=(2.0*(XN-2.0)*ZETA1+ZETA2)/FNC2
C
      ZTAU=TAU/SQRT(ESTVAR)
C
      T=ZTAU+0.0005
      IT=INT(1000*T)
      IF (IT.LT.1000) GO TO 111
      IF (IT.GT.4499) GO TO 222
C
      ID(IT)=ID(IT)+1
      GO TO 777
C
  111 ID(999)=ID(999)+1
      GO TO 777
  222 ID(4500)=ID(4500)+1
C
  777 CONTINUE
C
      IQ=0
C
      DO 333 J=999,4500
      IQ=IQ+ID(J)
      WRITE(6,444) J,IQ
  444 FORMAT(' ',2I10)
  333 CONTINUE
      STOP
      END
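The error-generation step of Program One can be illustrated outside FORTRAN. The following is a minimal Python sketch, not part of the original listing, that mirrors the transformation in the loop over I: two uniforms are mapped to a radius and an angle, and the second error is a linear combination of the two coordinates. The function names pearson_vii_pair and simulate_sample are hypothetical, and Python's random module stands in for the IMSL generators GGNML and GGUBS, so the random streams will not match the original runs.

```python
import math
import random


def pearson_vii_pair(u1, u2, rho=0.0, ex=-1.0):
    """Map two uniforms (u1 in (0, 1], u2 in [0, 1)) to one error pair (E, F).

    Mirrors the listing: W1 = SQRT(R(I1)**EX - 1) * COS(2*pi*R(I2)),
    W2 the same with SIN, and W3 = RHO*W1 + SQRT(1 - RHO**2)*W2.
    EX = -1.0 is the value set in the listing (stated there for NU = 2.0).
    """
    radius = math.sqrt(u1 ** ex - 1.0)
    w1 = radius * math.cos(2.0 * math.pi * u2)
    w2 = radius * math.sin(2.0 * math.pi * u2)
    w3 = rho * w1 + math.sqrt(1.0 - rho ** 2) * w2
    return w1, w3


def simulate_sample(n, rng, rho=0.0, ex=-1.0):
    """Generate (x, y, z) with Y = X + E and Z = X + F, X standard normal."""
    xs, ys, zs = [], [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        u1 = 1.0 - rng.random()  # shift to (0, 1] so that u1 ** ex is finite
        u2 = rng.random()
        e, f = pearson_vii_pair(u1, u2, rho, ex)
        xs.append(x)
        ys.append(x + e)
        zs.append(x + f)
    return xs, ys, zs


if __name__ == "__main__":
    x, y, z = simulate_sample(20, random.Random(12345))
    print(len(x), len(y), len(z))
```

With RHO = 0.0, as in the listing, F reduces to the sine coordinate W2, so E and F share the same radius but use orthogonal directions.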
C

C     PROGRAM TWO
C
C     A PROGRAM WHICH CALCULATES THE FREQUENCIES OF
C     REJECTING THE NULL HYPOTHESIS OF THE INDEPEND.
C     BETWEEN THE ERROR TERMS E AND F.  THESE
C     FREQUENCIES ARE CALCULATED FOR THE SEVEN
C     STATISTICS UNDER CONSIDERATION.  THIS PROGRAM
C     ALSO CALCULATES THE NUMBER OF TIMES ANY TWO
C     OF THESE STATISTICS REJECT THE NULL
C     SIMULTANEOUSLY:
C

      INTEGER NQ,NR,NS
      REAL R(32),X(8),W1(8),W2(8),W3(8)
      REAL Y(8),Z(8),CC(8),S(8)
      REAL U(8),V(8)
      REAL E1(8),E2(8)
      REAL SS(8),ST(8),TT(8),RST(8),STN(8),PRN(8)
      REAL PTN(8)
      REAL WK(96)
      INTEGER MAT(10,10)
      LOGICAL A,B,C,C1,C2,C3,D,E,F,G
      DOUBLE PRECISION DSEED1,DSEED2,DSEED3
C
      M=1000
      XM=FLOAT(M)
C
      NR=32
      NQ=1
      NS=8
      DSEED1=143547.D0
      DSEED2=123457.D0
      DSEED3=154677.D0
C
      DO 11 I=1,10
      DO 22 J=1,10
   22 MAT(I,J)=0
   11 CONTINUE
C
      DO 777 K=1,M
C
      CALL GGNML(DSEED3,NS,S)
      CALL GGNML(DSEED2,NR,R)
      N=8
      XN=FLOAT(N)
      FNC2=XN*(XN-1.0)/2.0
C
      DELTA=1.7
C
      DO 25 I=1,N
      X(I)=S(I)
      W1(I)=R(I)
      J=N+I
      W2(I)=R(J)
      JJ=2*N+I
      W3(I)=R(JJ)
      E1(I)=W1(I)+DELTA*W3(I)
      E2(I)=W2(I)+DELTA*W3(I)
      Y(I)=X(I)+E1(I)
      Z(I)=X(I)+E2(I)
   25 CONTINUE

C
      CALL BETA(N,XN,X,Y,Z,BHAT1,BHAT2)
      DO 27 I=1,N
      U(I)=Y(I)-BHAT1*X(I)
      V(I)=Z(I)-BHAT2*X(I)
   27 CONTINUE
C
      CALL TAUHAT(N,XN,U,V,SUMC,SSC)
      TAUK=SUMC/2.0
C
      TAU=SUMC/(XN*(XN-1.0))
C
      IF (TAU.EQ.1.0) TAU=0.999
      IF (TAU.EQ.-1.0) TAU=-0.999
      ZETA1=SSC/(XN*(XN-1.0)*(XN-1.0))
      ZETA2=1.0-TAU*TAU
      VARHAT=(2.0*(XN-2.0)*ZETA1+ZETA2)/FNC2
C
      STARK=TAU/SQRT(VARHAT)
C
C     COMPARE KENDALL'S TAU CALCULATED ON THE RESIDUALS
C     ADJUSTED BY O.L.S. ESTIMATORS TO TABLE A.21 OF
C     HOLLANDER & WOLFE, TO OUR SIMULATED TABLES, AND
C     TO THE Z-TABLES AFTER STANDARDIZATION BY VARIANCE
C     UNDER INDEPENDENCE:
C
      IF (TAUK.EQ.14.0) CALL GGUBS(DSEED1,NQ,Q)
      A=(TAUK.GE.16.0.OR.(TAUK.EQ.14.0.AND.
     *   Q.LE.0.826087))
C
      B=(TAUK.GE.16.0.OR.(TAUK.EQ.14.0.AND.
     *   Q.LE.0.284697))
C
      C=(TAUK.GE.16.0.OR.(TAUK.EQ.14.0.AND.
     *   Q.LE.0.83408))
C
      IF (A) MAT(1,1)=MAT(1,1)+1


      IF (B) MAT(2,2)=MAT(2,2)+1
      IF (C) MAT(3,3)=MAT(3,3)+1
      IF (A.AND.B) MAT(1,2)=MAT(1,2)+1
      IF (A.AND.C) MAT(1,3)=MAT(1,3)+1
      IF (B.AND.C) MAT(2,3)=MAT(2,3)+1
C

C     COMPARE K* TO THE SIMULATED NULL DISTNS;
C       1) FROM THE ORDINARY CORR. PROBLEM
C       2) FROM OLS RESIDUALS (NORMALITY)
C       3) FROM OLS RESIDUALS (PEARSON VII);
C
      C1=(STARK.GE.1.78)
      C2=(STARK.GE.1.98)
      C3=(STARK.GE.2.437)
      IF (C1) MAT(4,4)=MAT(4,4)+1
      IF (C2) MAT(5,5)=MAT(5,5)+1
      IF (C3) MAT(6,6)=MAT(6,6)+1
      IF (A.AND.C1) MAT(1,4)=MAT(1,4)+1
      IF (A.AND.C2) MAT(1,5)=MAT(1,5)+1
      IF (A.AND.C3) MAT(1,6)=MAT(1,6)+1
      IF (B.AND.C1) MAT(2,4)=MAT(2,4)+1
      IF (B.AND.C2) MAT(2,5)=MAT(2,5)+1
      IF (B.AND.C3) MAT(2,6)=MAT(2,6)+1
      IF (C.AND.C1) MAT(3,4)=MAT(3,4)+1
      IF (C.AND.C2) MAT(3,5)=MAT(3,5)+1
      IF (C.AND.C3) MAT(3,6)=MAT(3,6)+1
      IF (C1.AND.C2) MAT(4,5)=MAT(4,5)+1
      IF (C1.AND.C3) MAT(4,6)=MAT(4,6)+1
      IF (C2.AND.C3) MAT(5,6)=MAT(5,6)+1
C

C     COMPARE PEARSON'S R WITH STUDENT-T WITH N-3 DF.
C
      CALL JACK(N,XN,U,V,SYY,SZZ,SYZ,RYZ,TN,VRN,VTN)
      IF (RYZ.GE.0.999) RYZ=0.99
      RNT=RYZ*SQRT((XN-3.0)/(1.0-RYZ**2))
      D=(RNT.GE.2.015)
      IF (D) MAT(7,7)=MAT(7,7)+1
      IF (A.AND.D) MAT(1,7)=MAT(1,7)+1
      IF (B.AND.D) MAT(2,7)=MAT(2,7)+1
      IF (C.AND.D) MAT(3,7)=MAT(3,7)+1
      IF (C1.AND.D) MAT(4,7)=MAT(4,7)+1
      IF (C2.AND.D) MAT(5,7)=MAT(5,7)+1
      IF (C3.AND.D) MAT(6,7)=MAT(6,7)+1
C

C     COMPARE THE TRANSFORMED FISHER'S Z
C     STANDARDIZED BY ITS VARIANCE 1/(N-3) TO Z_0.05;
C
      ZTN=SQRT((XN-3.0))*TN
      E=(ZTN.GE.1.645)
      IF (E) MAT(8,8)=MAT(8,8)+1
      IF (A.AND.E) MAT(1,8)=MAT(1,8)+1
      IF (B.AND.E) MAT(2,8)=MAT(2,8)+1
      IF (C.AND.E) MAT(3,8)=MAT(3,8)+1
      IF (C1.AND.E) MAT(4,8)=MAT(4,8)+1
      IF (C2.AND.E) MAT(5,8)=MAT(5,8)+1
      IF (C3.AND.E) MAT(6,8)=MAT(6,8)+1
      IF (D.AND.E) MAT(7,8)=MAT(7,8)+1

C
C     USE THE JACKKNIFE ESTIMATORS OF THE VARIANCES
C     OF PEARSON'S R, AND FISHER'S Z, AND COMPARE TO
C     Z_0.05:
C
      SDRN=SQRT(VRN/XN)
      SDTN=SQRT(VTN/XN)
      RNJACK=RYZ/SDRN
      TNJACK=TN/SDTN
C
      F=(RNJACK.GE.1.645)
      G=(TNJACK.GE.1.645)
C
      IF (F) MAT(9,9)=MAT(9,9)+1
      IF (G) MAT(10,10)=MAT(10,10)+1
      IF (A.AND.F) MAT(1,9)=MAT(1,9)+1
      IF (B.AND.F) MAT(2,9)=MAT(2,9)+1
      IF (C.AND.F) MAT(3,9)=MAT(3,9)+1
      IF (C1.AND.F) MAT(4,9)=MAT(4,9)+1
      IF (C2.AND.F) MAT(5,9)=MAT(5,9)+1
      IF (C3.AND.F) MAT(6,9)=MAT(6,9)+1
      IF (D.AND.F) MAT(7,9)=MAT(7,9)+1
      IF (E.AND.F) MAT(8,9)=MAT(8,9)+1
      IF (A.AND.G) MAT(1,10)=MAT(1,10)+1
      IF (B.AND.G) MAT(2,10)=MAT(2,10)+1
      IF (C.AND.G) MAT(3,10)=MAT(3,10)+1
      IF (C1.AND.G) MAT(4,10)=MAT(4,10)+1
      IF (C2.AND.G) MAT(5,10)=MAT(5,10)+1
      IF (C3.AND.G) MAT(6,10)=MAT(6,10)+1
      IF (D.AND.G) MAT(7,10)=MAT(7,10)+1
      IF (E.AND.G) MAT(8,10)=MAT(8,10)+1
      IF (F.AND.G) MAT(9,10)=MAT(9,10)+1
C
  777 CONTINUE
C
      DO 888 I=1,10
      WRITE(6,123) (MAT(I,J),J=1,10)
  123 FORMAT('-',10I8)
  888 CONTINUE
      STOP
      END


C
C     SUBROUTINE ONE
C
C     THIS SUBROUTINE CALCULATES THE LEAST ABSOLUTE
C     VALUE (LAV) PARAMETER ESTIMATES OF THE SIMPLE
C     LINEAR MODEL (JOSVANGER AND SPOSITO, 1983);
C
      SUBROUTINE DESL1(Y,X,N,A,B,ITER,FF,WT,IND,IFAULT)
      INTEGER IND(N),ITER
      REAL X(20),Y(20),FF(20),WT(20),A,B,A1,A2,B1
      REAL B2
      DATA TOL/1.0E-6/
      XN=FLOAT(N)
C
C     FIND ESTIMATES OF A AND B
C
      IFAULT=0
      SY=0.0
      A1=X(1)
      DO 10 I=1,N
      IF (X(I).NE.A1) IFAULT=1
      SY=SY+Y(I)
   10 CONTINUE
      B=0.
      A=SY/XN
      IF (IFAULT.EQ.0) RETURN
      ITER=0
      IRESA=1
      IRESB=0
      DEV=ABS(Y(1)-A)
      DO 20 J=2,N
      IF (ABS(Y(J)-A).GE.DEV) GO TO 20
      DEV=ABS(Y(J)-A)
      IRESA=J
   20 CONTINUE
C
C     RECORD X AND Y CORRESP. TO THE MIN. ABS. DEVIATION
C
   30 J=IRESA
      XJ=X(J)
      YJ=Y(J)
      I=1
      M=1
      K=N
      TWL=0.
      TWU=0.

C
C     SEPARATE SLOPE VALUES > B FROM THOSE < B
C
   40 XIJ=(X(I)-X(J))
      IF (XIJ.EQ.0.) GO TO 60
      FI=(Y(I)-YJ)/XIJ
      IF (XIJ.LT.0.) XIJ=-XIJ
      IF (FI.GT.B) GO TO 50
      FF(M)=FI
      WT(M)=XIJ
      IND(M)=I
      TWL=TWL+XIJ
      M=M+1
      GO TO 60
   50 FF(K)=FI
      WT(K)=XIJ
      IND(K)=I
      TWU=TWU+XIJ
      K=K-1
   60 IF (I.EQ.N) GO TO 70
      I=I+1
      GO TO 40

C
C     SET THE NEW B VALUE = WEIGHTED MEDIAN SLOPE
C
   70 ASUM=(TWL+TWU)/2.
      IF (TWL.GE.TWU) GO TO 130
      M=M-1
   80 K=K+1
      M=M+1
      I=K
   90 FNEW=FF(I)
      INEW=I
  100 IF (I.EQ.N) GO TO 110
      I=I+1
      IF (FF(I).LT.FNEW) GO TO 90
      GO TO 100
  110 TWL=TWL+WT(INEW)
      IF (TWL.GE.ASUM) GO TO 120
      FF(INEW)=FF(K)
      WT(INEW)=WT(K)
      IND(INEW)=IND(K)
      GO TO 80
  120 BNEW=FNEW
      JNEW=IND(INEW)
      GO TO 180
  130 M=M-1
      I=M
  140 FNEW=FF(I)
      INEW=I
  150 IF (I.EQ.1) GO TO 160
      I=I-1
      IF (FF(I).GT.FNEW) GO TO 140
      GO TO 150
  160 TWU=TWU+WT(INEW)
      IF (TWU.GT.ASUM) GO TO 170
      FF(INEW)=FF(M)
      WT(INEW)=WT(M)
      IND(INEW)=IND(M)
      GO TO 130
  170 BNEW=FNEW
      JNEW=IND(INEW)
C
C     FIND NEW INTERCEPT VALUE
C     CHANGE ITERATION COUNT
C
  180 ITER=ITER+1
      A=YJ-BNEW*XJ
C
C     TEST ONE FOR SOLUTION:
C     COMPARE DIFFERENCE IN B VALUES TO TOLERANCE
C     LEVEL
C
      IF (ABS(B-BNEW).LE.TOL) GO TO 190
      B=BNEW
C
C     TEST TWO FOR SOLUTION:
C     CHECK FOR REPETITION IN INDEX PATTERN
C
      IF (IRESB.EQ.JNEW) GO TO 190
      IRESB=IRESA
      IRESA=JNEW
      GO TO 30
  190 RETURN
      END
C

C     SUBROUTINE TWO
C
C     THIS SUBROUTINE CALCULATES KENDALL'S TAU.  THE
C     NUMBERS OBTAINED THROUGH THIS SUBROUTINE ARE
C     USED IN THE MAIN PROGRAM TO FIND K*:
C
      SUBROUTINE TAUHAT(N,XN,Y,Z,SUMC,SSC)
      DIMENSION Y(8),Z(8),CC(8)
      SUMC=0.0
      SSC=0.0
C
      DO 1 I=1,N
      COUNT=0.0
      DO 2 J=1,N
      YIJ=Y(I)-Y(J)
      ZIJ=Z(I)-Z(J)
      IF ((YIJ*ZIJ).GT.0.0) COUNT=COUNT+1.0
      IF ((YIJ*ZIJ).LT.0.0) COUNT=COUNT-1.0
    2 CONTINUE
      CC(I)=COUNT
      SUMC=SUMC+CC(I)
C
    1 CONTINUE
C
      CBAR=SUMC/XN
C
      DO 3 I=1,N
    3 SSC=SSC+(CC(I)-CBAR)*(CC(I)-CBAR)
      RETURN
      END
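To make the link between TAUHAT and the statistic K* explicit, the following Python sketch reproduces the arithmetic of the subroutine together with the standardization carried out in the main programs (TAU, ZETA1, ZETA2 and the variance estimate). It is a transcription for illustration only; the function names tauhat and k_star are hypothetical, and the clamping of TAU at 0.999 in absolute value simply copies the listing.

```python
import math


def tauhat(y, z):
    """Return (sumc, ssc) as computed by subroutine TAUHAT.

    cc[i] is the number of concordant minus discordant pairs involving
    observation i; sumc is their total (each pair is counted twice) and
    ssc is the sum of squared deviations of cc about its mean.
    """
    n = len(y)
    cc = []
    for i in range(n):
        count = 0.0
        for j in range(n):
            prod = (y[i] - y[j]) * (z[i] - z[j])
            if prod > 0.0:
                count += 1.0
            elif prod < 0.0:
                count -= 1.0
        cc.append(count)
    sumc = sum(cc)
    cbar = sumc / n
    ssc = sum((c - cbar) ** 2 for c in cc)
    return sumc, ssc


def k_star(y, z):
    """Return (tau, K*) following the main-program formulas."""
    n = float(len(y))
    sumc, ssc = tauhat(y, z)
    tau = sumc / (n * (n - 1.0))
    if tau == 1.0:      # the listing avoids a zero variance estimate
        tau = 0.999
    if tau == -1.0:
        tau = -0.999
    zeta1 = ssc / (n * (n - 1.0) * (n - 1.0))
    zeta2 = 1.0 - tau * tau
    var_hat = (2.0 * (n - 2.0) * zeta1 + zeta2) / (n * (n - 1.0) / 2.0)
    return tau, tau / math.sqrt(var_hat)


if __name__ == "__main__":
    print(k_star([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0]))
```

For the four-point example, five pairs are concordant and one is discordant, so SUMC = 8, tau = 2/3, the variance estimate is 1/6, and K* equals (2/3) times the square root of 6.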


C
C     SUBROUTINE THREE
C
      SUBROUTINE JACK(N,XN,Y,Z,SYY,SZZ,SYZ,RYZ,
     *   TN,VRN,VTN)
      DIMENSION Y(20),Z(20)
      DOUBLE PRECISION DXN,DY(20),DZ(20),DSYY,DSZZ
      DOUBLE PRECISION DVRN,DVTN,SS(20),ST(20)
      DOUBLE PRECISION STN(20),PRN(20),PTN(20),DTN
      DOUBLE PRECISION SUM3,SUM4,SUMPRN,SUMPTN
      DOUBLE PRECISION BARY,BARZ
      DOUBLE PRECISION DSYZ,DRYZ,TT(20),RST(20)
      DOUBLE PRECISION SUM1,SUM2,SAVEY,SAVEZ,S2,T2
      DOUBLE PRECISION SUMY,SUMZ,YBAR,ZBAR
C


      SUMY=0.D0
      SUMZ=0.D0
      DSYY=0.D0
      DSZZ=0.D0
      DSYZ=0.D0
      SUM3=0.D0
      SUM4=0.D0
      SUMPRN=0.D0
      SUMPTN=0.D0
      DXN=XN
      DO 5 I=1,N
      DY(I)=Y(I)
      DZ(I)=Z(I)
      SUMY=SUMY+DY(I)
      SUMZ=SUMZ+DZ(I)
    5 CONTINUE
      YBAR=SUMY/DXN
      ZBAR=SUMZ/DXN
      DO 6 I=1,N
      DSYY=DSYY+(DY(I)-YBAR)*(DY(I)-YBAR)
      DSZZ=DSZZ+(DZ(I)-ZBAR)*(DZ(I)-ZBAR)
      DSYZ=DSYZ+(DY(I)-YBAR)*(DZ(I)-ZBAR)
    6 CONTINUE
C
      DRYZ=DSYZ/DSQRT(DSYY*DSZZ)
      DTN=0.5D0*(DLOG(1.D0+DRYZ)-DLOG(1.D0-DRYZ))
      RYZ=DRYZ

      DO 10 I=1,N
      SAVEY=DY(I)
      S2=SAVEY
      DY(I)=0.D0
      SAVEZ=DZ(I)
      T2=SAVEZ
      DZ(I)=0.D0
      SUM1=0.D0
      SUM2=0.D0
C
      DO 11 J=1,N
      SUM1=SUM1+DY(J)
      SUM2=SUM2+DZ(J)
   11 CONTINUE
C
      DY(I)=SAVEY
      DZ(I)=SAVEZ
C
      BARY=SUM1/(DXN-1.D0)
      BARZ=SUM2/(DXN-1.D0)
      SS(I)=DSYY-((DXN-1.D0)/DXN)*(BARY-S2)*(BARY-S2)
      TT(I)=DSZZ-((DXN-1.D0)/DXN)*(BARZ-T2)*(BARZ-T2)
      ST(I)=DSYZ-((DXN-1.D0)/DXN)*(BARY-S2)*(BARZ-T2)
      RST(I)=ST(I)/DSQRT(SS(I)*TT(I))
      STN(I)=0.5D0*(DLOG(1.D0+RST(I))-DLOG(1.D0-
     *   RST(I)))
      PRN(I)=DXN*DRYZ-(DXN-1.D0)*RST(I)
      PTN(I)=DXN*DTN-(DXN-1.D0)*STN(I)
      SUMPRN=SUMPRN+PRN(I)
      SUMPTN=SUMPTN+PTN(I)
C
   10 CONTINUE
C
      PRNBAR=SUMPRN/DXN
      PTNBAR=SUMPTN/DXN
      DO 12 J=1,N
      SUM3=SUM3+(PRN(J)-PRNBAR)*(PRN(J)-PRNBAR)
      SUM4=SUM4+(PTN(J)-PTNBAR)*(PTN(J)-PTNBAR)
   12 CONTINUE
      DVRN=SUM3/(DXN-1.D0)
      DVTN=SUM4/(DXN-1.D0)
C
      TN=DTN
      VRN=DVRN
      VTN=DVTN
      RETURN
      END
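The jackknife computation in JACK can be summarized in a few lines. The Python sketch below is an illustration under stated assumptions rather than a transcription: it recomputes each leave-one-out correlation directly instead of using the updating formulas for the sums of squares that appear in the listing, which should give the same pseudo-values up to rounding. The names pearson_r, fisher_z and jackknife_r_and_z are hypothetical.

```python
import math


def pearson_r(y, z):
    """Pearson's product-moment correlation coefficient."""
    n = len(y)
    ybar = sum(y) / n
    zbar = sum(z) / n
    syy = sum((a - ybar) ** 2 for a in y)
    szz = sum((b - zbar) ** 2 for b in z)
    syz = sum((a - ybar) * (b - zbar) for a, b in zip(y, z))
    return syz / math.sqrt(syy * szz)


def fisher_z(r):
    """Fisher's transformation, 0.5 * log((1 + r) / (1 - r))."""
    return 0.5 * (math.log(1.0 + r) - math.log(1.0 - r))


def jackknife_r_and_z(y, z):
    """Return (r, t, vrn, vtn): r, its Fisher transform, and the sample
    variances of the jackknife pseudo-values of r and of t."""
    y, z = list(y), list(z)
    n = len(y)
    r = pearson_r(y, z)
    t = fisher_z(r)
    pseudo_r, pseudo_t = [], []
    for i in range(n):
        r_i = pearson_r(y[:i] + y[i + 1:], z[:i] + z[i + 1:])
        pseudo_r.append(n * r - (n - 1) * r_i)
        pseudo_t.append(n * t - (n - 1) * fisher_z(r_i))

    def sample_var(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

    return r, t, sample_var(pseudo_r), sample_var(pseudo_t)


if __name__ == "__main__":
    r, t, vrn, vtn = jackknife_r_and_z([1, 2, 3, 4, 5], [2, 1, 4, 3, 6])
    n = 5
    # Standardized statistics as formed in Program Two.
    print(r / math.sqrt(vrn / n), t / math.sqrt(vtn / n))
```

In Program Two the two standardized statistics printed here are the quantities RNJACK and TNJACK, each compared with 1.645.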

C